Upload
konrad-malawski
View
2.173
Download
0
Embed Size (px)
Citation preview
Konrad `ktoso` Malawski @ Scala Matsuri 2017 @ Tokyo
Survival guide for the Streaming World
Akka-chan’s
0.7Early preview
RS started (late 2013)
2014
2016年: JDK 9 への導入決定
Reactive Streams started in late 2013,initiated by Roland Kuhn, Viktor Klang, Erik Meijer, Ben Christiansen
0.7Early preview
RS begins!
2.0Experimental
RS on way to JDK9
2014 2016
2014年: リアクティブストリーム始動2017年: 安定版。リアクティブストリーム付きの JDK9。
0.7Early preview
RS begins!
2.0Experimental
RS on way to JDK9
2.4.17 + 10.0.3Fully stable
RS in JDK9 (!)
2014 2016 2017
2014年: リアクティブストリーム始動2016年: JDK 9 への導入決定
Konrad `@ktosopl` Malawski
work: akka.io lightbend.com personal blog: http://kto.so
communities: geecon.org Java.pl / KrakowScala.pl sckrk.com GDGKrakow.pl lambdakrk.pl
Konrad `@ktosopl` Malawski
work: akka.io lightbend.com personal blog: http://kto.so
communities: geecon.org Java.pl / KrakowScala.pl sckrk.com GDGKrakow.pl lambdakrk.pl
The concurrent & distributed applications toolkit
Akka is a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications on the JVM
2017年: 安定版。リアクティブストリーム付きの JDK9。
Actors – Concurrency / high perf. messaging Cluster – Location transparent, resilient clusters Persistence – EventSourcing support (many DBs) Distributed Data – Availability-first gossiped CRDTs HTTP – Fully Async & Reactive Http Server (Websockets, Http/2)
soon:Typed – really well-typed Actors, coming this year
…and much more (kafka, cassandra, testing, …)
Suddenly everyone jumped on the word “Stream”.
Akka Streams / Reactive Streams started end-of-2013.
“Streams”
* when put in “” the word does not appear in project name, but is present in examples / style of APIs / wording.
Suddenly everyone jumped on the word “Stream”.
Akka Streams / Reactive Streams started end-of-2013.
The word “Stream” is used in many contexts/meanings
Akka Streams Reactive Streams RxJava “streams”* Spark Streaming Apache Storm “streams”* Java Steams (JDK8) Reactor “streams”* Kafka Streams ztellman / Manifold (Clojure)
* when put in “” the word does not appear in project name, but is present in examples / style of APIs / wording.
Apache GearPump “streams” Apache [I] Streams (!) Apache [I] Beam “streams” Apache [I] Quarks “streams” Apache [I] Airflow “streams” (dead?) Apache [I] Samza Scala Stream Scalaz Streams, now known as FS2 Swave.io Java InputStream / OutputStream / … :-)
2017年: 安定版。リアクティブストリーム付きの JDK9。
“Stream” What does it mean?!
• Possibly infinite datasets (“streams”)
• “Streams are NOT collections.”
• Processed element-by-element• Element could mean “byte” • More usefully though it means a specific type “T”
• Asynchronous processing• Asynchronous boundaries (between threads)
• Network boundaries (between machines)
2017年: 安定版。リアクティブストリーム付きの JDK9。
The new “big data” is “fast data”.
Enabled by reactive building blocks.
“Fast Data”
2017年: 安定版。リアクティブストリーム付きの JDK9。
The typical “big data” architecture ever since Hadoop entered the market
Legacy Hadoop architectures would mostly look like this:
(legacy, batch oriented architectures)
典型的な「ビッグデータ」アーキテクチャ
The typical “big data” architecture ever since Hadoop entered the market
Legacy Hadoop architectures would mostly look like this:
(legacy, batch oriented architectures)
典型的な「ビッグデータ」アーキテクチャ
Fast Data – architecture example
Many of these technologies compose together into what we call Fast Data architectures.
典型的な「ビッグデータ」アーキテクチャ
Fast Data – architecture example
It’s like big-data, but FAST - which often implies various kinds of streaming.
典型的な「ビッグデータ」アーキテクチャ
Where does Akka Stream fit?
Akka Streams specifically fits,if you answer yes to any of these:
• Should it take on public traffic?
• Processing in hot path for requests?• Integrate various technologies?• Protect services from over-load?• Introspection, debugging, excellent Akka integration?• (vs. other reactive-stream impls.)
以上の質問に「はい」と答えた場合は Akka Stream に向いている
How do I pick which “streaming” I need?
Kafka serves best as a transport for pub-sub across services.
• Note that Kafka Streams (db ops are on the node) is rather, different than the Reactive Kafka client
• Great for cross-service communication instead of HTTP Request / Reply
Kafka はサービス間の pub-sub 通信に向いているHTTP の代わりにサービス間の通信に使う
How do I pick which “streaming” I need?
Spark has vast libraries for ML or join etc ops.
• It’s the “hadoop replacement”.• Spark Streaming is windowed-batches• Latency anywhere up from 0.5~1second
• Great for cross-service communication instead of HTTP Req/Reply
Spark は機械学習系が充実している
Oh yeah, there’s JDK8 “Stream” too!
Terrible naming decision IMHO, since Java’s .stream()
• Geared for collections • Best for finite and known-up-front data• Lazy, sync/async (async rarely used)• Very (!) hard to extend
It’s the opposite what we talk about in Streaming systems!
It’s more: “bulk collection operations”Also known as… Scala collections API (i.e. Iterator
JDK8 の Stream はイテレータ的なもの
What about JDK9 “Flow”?
JDK9 introduces java.util.concurrent.Flow
• Is a 1:1 copy of the Reactive Streams interfaces• On purpose, for people to be able to impl. it
• Does not provide useful implementations • Is only the inter-op interfaces • Libraries like Akka Streams implement RS,
and expose useful APIs for you to use.
JDK9 の Flow はリアクティブ・ストリーム
“Not-quite-Reactive-System” The reason we started researching into transparent to users flow control.
“Best practices are solutions to yesterdays problems.”
https://twitter.com/FrankBuytendijk/status/795555578592555008
Circuit breaking as substitute of flow-control
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
HTTP/1.1 503 Service Unavailable
HTTP/1.1 503 Service Unavailable
Throttling as represented by 503 responses. Client will back-off… but how?What if most of the fleet is throttling?
http://doc.akka.io/docs/akka/2.4/common/circuitbreaker.html
HTTP/1.1 503 Service Unavailable
HTTP/1.1 503 Service Unavailable
http://doc.akka.io/docs/akka/2.4/common/circuitbreaker.html
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
“slamming the breaks”
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
“slamming the breaks”
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
“slamming the breaks”
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
“slamming the breaks”
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
“slamming the breaks”
See also, Nitesh Kant, Netflix @ Reactive Summit https://www.youtube.com/watch?v=5FE6xnH5Lak
We’ll re-visit this specific case in a bit :-)
“slamming the breaks”
A fundamental building block. Not end-user API by itself.
reactive-streams.org
Reactive Streams
Reactive StreamsMore of an SPI (Service Provider Interface),
than API.
reactive-streams.org
No no no…! Not THAT Back-pressure!
Also known as: application level flow control.
What is back-pressure?
Reactive Streams - story: 2013’s impls
~2013:
Reactive Programming becoming widely adopted on JVM.
- Play introduced “Iteratees”- Akka (2009) had Akka-IO (TCP etc.)- Ben starts work on RxJava
http://blogs.msdn.com/b/rxteam/archive/2009/11/17/announcing-reactive-extensions-rx-for-net-silverlight.aspxhttp://infoscience.epfl.ch/record/176887/files/DeprecatingObservers2012.pdf - Ingo Maier, Martin Odersky
https://github.com/ReactiveX/RxJava/graphs/contributorshttps://github.com/reactor/reactor/graphs/contributors
https://medium.com/@viktorklang/reactive-streams-1-0-0-interview-faaca2c00bec#.69st3rndy
Teams discuss need for back-pressure in simple user API.Play’s Iteratee / Akka’s NACK in IO.
}
Reactive Streams - story: 2013’s impls
2014–2015:
Reactive Streams Spec & TCK development, and implementations.
1.0 released on April 28th 2015, with 5+ accompanying implementations.
2015Proposed to be included with JDK9 by Doug Leavia JEP-266 “More Concurrency Updates”
http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/6e50b992bef4/src/java.base/share/classes/java/util/concurrent/Flow.java
Kernel does this!Routers do this!
(TCP)
Use bounded buffer, drop messages + require re-sending
Push model
Fast Publisher will send at-most 3 elements. This is pull-based-backpressure.
Reactive Streams: “dynamic push/pull”
JEP-266 – soon…!public final class Flow { private Flow() {} // uninstantiable
@FunctionalInterface public static interface Publisher<T> { public void subscribe(Subscriber<? super T> subscriber); }
public static interface Subscriber<T> { public void onSubscribe(Subscription subscription); public void onNext(T item); public void onError(Throwable throwable); public void onComplete(); }
public static interface Subscription { public void request(long n); public void cancel(); }
public static interface Processor<T,R> extends Subscriber<T>, Publisher<R> { }}
JEP-266 – soon…!public final class Flow { private Flow() {} // uninstantiable
@FunctionalInterface public static interface Publisher<T> { public void subscribe(Subscriber<? super T> subscriber); }
public static interface Subscriber<T> { public void onSubscribe(Subscription subscription); public void onNext(T item); public void onError(Throwable throwable); public void onComplete(); }
public static interface Subscription { public void request(long n); public void cancel(); }
public static interface Processor<T,R> extends Subscriber<T>, Publisher<R> { }}
Single basic (helper) implementation available in JDK:SubmissionPublisher
Reactive Streams: goals
1) Avoiding unbounded buffering across async boundaries
2)Inter-op interfaces between various libraries
Reactive Streams: goals1) Avoiding unbounded buffering across async boundaries
2) Inter-op interfaces between various libraries
Argh, implementing a correct RS Publisher or Subscriber is so hard!
1) Avoiding unbounded buffering across async boundaries
2) Inter-op interfaces between various libraries
Reactive Streams: goals
Argh, implementing a correct RS Publisher or Subscriber is so hard!
Reactive Streams: goals
Argh, implementing a correct RS Publisher or Subscriber is so hard!
You should be using Akka Streams instead!
1) Avoiding unbounded buffering across async boundaries
2) Inter-op interfaces between various libraries
Inspiring other technologies
It’s been a while since Java inspired other modern technologies, hasn’t it?
The implementation.
Complete and awesome Java and Scala APIs. As everything since day 1 in Akka.
Akka Streams
Akka Streams in 20 seconds: // types: Source[Out, Mat] Flow[In, Out, Mat] Sink[In, Mat]
// generally speaking, it's always: val ready = Source.from…(???).via(flow).map(i => i * 2).to(sink)
val mat: Mat = ready.run() // the usual example: val f: Future[String] = Source.single(1).map(i => i.toString).runWith(Sink.head)
Proper static typing!
Akka Streams in 20 seconds:
// types: _Source[Int, NotUsed] Flow[Int, String, NotUsed] Sink[String, Future[String]]
Source.single(1).map(i => i / 0).runWith(Sink.head)
Source.single(1).map(i => i.toString).runWith(Sink.head)
// types: _Source[Int, NotUsed] Flow[Int, String, NotUsed] Sink[String, Future[String]]
Akka Streams in 20 seconds:
Alpakka – a community for Stream connectors
http://developer.lightbend.com/docs/alpakka/current/
Krzysztof Ciesielski Reactive Kafka (Alpakka) maintainer
Room B @ 15:30,1日
A core feature not obvious to the untrained eye…!
Quiz time! TCP is a ______ protocol?
Akka Streams & HTTP
A core feature not obvious to the untrained eye…!
Quiz time! TCP is a STREAMING protocol!
Akka Streams & HTTP
Streaming in Akka HTTP
HttpServer as a: Flow[HttpRequest, HttpResponse]
http://doc.akka.io/docs/akka/2.4/scala/stream/stream-customize.html#graphstage-scala “Framed entity streaming” https://github.com/akka/akka/pull/20778
Streaming in Akka HTTP
HttpServer as a: Flow[HttpRequest, HttpResponse]
HTTP Entity as a: Source[ByteString, _]
http://doc.akka.io/docs/akka/2.4/scala/stream/stream-customize.html#graphstage-scala “Framed entity streaming” https://github.com/akka/akka/pull/20778
Streaming in Akka HTTP
http://doc.akka.io/docs/akka/2.4/scala/stream/stream-customize.html#graphstage-scala “Framed entity streaming” https://github.com/akka/akka/pull/20778
HttpServer as a: Flow[HttpRequest, HttpResponse]
TCP as a: Flow[ByteString, ByteString]
Websocket connection as a: Flow[ws.Message, ws.Message]
Streaming from Akka HTTP (Java)wrap(mySource): Source === apply(Source): Source def bytesToMeasurements(bytes: Source[ByteString, Any]): Source[Double, NotUsed] = { val measurementLines = bytes .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1000)) .map(_.utf8String) measurementLines .via(CsvSupport.takeColumns(Set("LinAccX (g)", "LinAccY (g)", "LinAccZ (g)"))) .mapConcat(_.flatMap(col => Try(col.toDouble).toOption)) .mapMaterializedValue(_ => NotUsed)}
Streaming from Akka HTTP (Java)wrap(mySource): Source === apply(Source): Source def bytesToMeasurements(bytes: Source[ByteString, Any]): Source[Double, NotUsed] = { val measurementLines = bytes .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1000)) .map(_.utf8String) measurementLines .via(CsvSupport.takeColumns(Set("LinAccX (g)", "LinAccY (g)", "LinAccZ (g)"))) .mapConcat(_.flatMap(col => Try(col.toDouble).toOption)) .mapMaterializedValue(_ => NotUsed)}
via(Flow) === Source.via(Flow) Flow.via(Flow) val bytesToMeasurements: Flow[ByteString, Double, NotUsed] = { val measurementLines = Flow[ByteString] .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1000)) .map(_.utf8String) measurementLines .via(CsvSupport.takeColumns(Set("LinAccX (g)", "LinAccY (g)", "LinAccZ (g)"))) .mapConcat(_.flatMap(col => Try(col.toDouble).toOption)) .mapMaterializedValue(_ => NotUsed)}
Streaming from Akka HTTP (Java)wrap(mySource): Source === apply(Source): Source def bytesToMeasurements(bytes: Source[ByteString, Any]): Source[Double, NotUsed] = { val measurementLines = bytes .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1000)) .map(_.utf8String) measurementLines .via(CsvSupport.takeColumns(Set("LinAccX (g)", "LinAccY (g)", "LinAccZ (g)"))) .mapConcat(_.flatMap(col => Try(col.toDouble).toOption)) .mapMaterializedValue(_ => NotUsed)}
via(Flow) === Source.via(Flow) Flow.via(Flow) val bytesToMeasurements: Flow[ByteString, Double, NotUsed] = { val measurementLines = Flow[ByteString] .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1000)) .map(_.utf8String) measurementLines .via(CsvSupport.takeColumns(Set("LinAccX (g)", "LinAccY (g)", "LinAccZ (g)"))) .mapConcat(_.flatMap(col => Try(col.toDouble).toOption)) .mapMaterializedValue(_ => NotUsed)}
Streaming Deep Learning Predictions
We have demonstrated:• Superior pluggability and reusability• Fully type-safe APIs• Simple custom Stages• Advanced custom GraphStages
Understanding Systems
At Lightbend we build for usability, understanding and performance.
Remember the car?That’s only:
“highest possible performance”.
Understanding Systems
We need controllable, introspectable, operable systems.
Monitoring, tracing.Right at the core of the tech.
We <3 contributions• Easy to contribute:
• https://github.com/akka/akka/issues?q=is%3Aissue+is%3Aopen+label%3Aeasy-to-contribute • https://github.com/akka/akka/issues?q=is%3Aissue+is%3Aopen+label%3A%22nice-to-have+%28low-prio%29%22
• Akka: akka.io && github.com/akka • Reactive Streams: reactive-streams.org • Reactive Socket: reactivesocket.io
• Mailing list:• https://groups.google.com/group/akka-user
• Public chat rooms: • http://gitter.im/akka/dev developing Akka • http://gitter.im/akka/akka using Akka
What’s next for Akka?• Akka 2.5.x • Distributed Data becomes stable in 2.5 • Persistence Query becomes stable in 2.5
• Alpakka connectors • Akka HTTP/2 • Akka HTTP default backend for Play (2.6.0-M1 sic!)
• Akka Typed (sic!)
• Better AbstractActor for Java8 users • Possible multi-data centre extensions …? • …?
Free e-book and printed report. bit.ly/why-reactive
Covers what reactive actually is.Implementing in existing architectures.
Thoughts from the team that’s buildingreactive apps since more than 6 years.
Obligatory “read my book!” slide :-)
ktoso @ lightbend.com twitter: ktosopl
github: ktosoteam blog: blog.akka.io
home: akka.io myself: kto.so
Q/A