Norikra in Action (ver. 2014 spring)

Intensive lecture material, University of Tsukuba, 2014/01/31

Norikra in Action

TAGOMORI Satoshi (@tagomoris)
LINE Corp.

2014/01/31 (Fri) at University of Tsukuba

the 2nd half

Friday, January 31, 2014

Norikra: Schema-less Stream Processing with SQL

Open source software (GPLv2)

http://norikra.github.io/

https://github.com/norikra/norikra

Norikra:

Schema-less event stream:
  Add/Remove data fields whenever you want

SQL:
  No more restarts to add/remove queries
  w/ JOINs, w/ SubQueries
  w/ UDF

Truly Complex events:
  Nested Hash/Array, accessible directly from SQL

Norikra Queries: (1)

SELECT name, age
FROM events

"events" is the target.

Input event:
{"name":"tagomoris", "age":34, "address":"Tokyo", "corp":"LINE", "current":"Tsukuba"}

Output:
{"name":"tagomoris","age":34}

Input event (no "age" field):
{"name":"tagomoris", "address":"Tokyo", "corp":"LINE", "current":"Tsukuba"}

Output: nothing
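
The behaviour of query (1) can be imitated in a few lines of Python. This is only a sketch of what the slide shows (an event that lacks a selected field yields nothing), not how Norikra is implemented; the function name select_fields is made up for this example.

```python
def select_fields(event, fields):
    """Return the projected event, or None when any selected field is missing."""
    if not all(name in event for name in fields):
        return None  # nothing is emitted for this event
    return {name: event[name] for name in fields}


full = {"name": "tagomoris", "age": 34, "address": "Tokyo",
        "corp": "LINE", "current": "Tsukuba"}
no_age = {"name": "tagomoris", "address": "Tokyo",
          "corp": "LINE", "current": "Tsukuba"}

print(select_fields(full, ["name", "age"]))    # {'name': 'tagomoris', 'age': 34}
print(select_fields(no_age, ["name", "age"]))  # None
```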

Norikra Queries: (2)

SELECT name, age
FROM events
WHERE current="Tsukuba"

Input event:
{"name":"tagomoris", "age":34, "address":"Tokyo", "corp":"LINE", "current":"Tsukuba"}

Output:
{"name":"tagomoris","age":34}

Input event ("current" is not "Tsukuba"):
{"name":"kawashima", "age":99, "address":"Tsukuba", "corp":"Univ", "current":"Dream"}

Output: nothing

Norikra Queries: (3)

SELECT age, COUNT(*) as cnt
FROM events.win:time_batch(5 mins)
GROUP BY age

Input event:
{"name":"tagomoris", "age":34, "address":"Tokyo", "corp":"LINE", "current":"Tsukuba"}

Output, every 5 mins:
{"age":34,"cnt":3}, {"age":33,"cnt":1}, ...
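
As a rough picture of what a 5-minute batch window with GROUP BY produces, the sketch below counts events per age in fixed windows. It is a simplification, not Esper's implementation: the timestamps are invented, windows are aligned to multiples of the window length, every event is assumed to carry the grouping key, and the function name time_batch_counts is hypothetical.

```python
from collections import Counter


def time_batch_counts(events, window_secs, key):
    """Count events per key value in fixed, non-overlapping time windows.

    events is an iterable of (timestamp_in_seconds, event_dict) pairs.
    Returns (window_start, rows) pairs for each non-empty window, in time order.
    """
    windows = {}
    for ts, event in events:
        start = ts - (ts % window_secs)  # start of the window containing ts
        windows.setdefault(start, Counter())[event[key]] += 1
    return [
        (start, [{key: value, "cnt": cnt} for value, cnt in counts.items()])
        for start, counts in sorted(windows.items())
    ]


events = [
    (10, {"name": "a", "age": 34}),
    (70, {"name": "b", "age": 34}),
    (200, {"name": "c", "age": 33}),
    (310, {"name": "d", "age": 34}),  # falls into the next 5-minute window
]
for start, rows in time_batch_counts(events, 300, "age"):
    print(start, rows)
# 0 [{'age': 34, 'cnt': 2}, {'age': 33, 'cnt': 1}]
# 300 [{'age': 34, 'cnt': 1}]
```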

Norikra Queries: (4)

Two queries on the same target, each emitting every 5 mins.

Input event:
{"name":"tagomoris", "age":34, "address":"Tokyo", "corp":"LINE", "current":"Tsukuba"}

SELECT age, COUNT(*) as cnt
FROM events.win:time_batch(5 mins)
GROUP BY age

Output:
{"age":34,"cnt":3}, {"age":33,"cnt":1}, ...

SELECT max(age) as max
FROM events.win:time_batch(5 mins)

Output:
{"max":51}

Norikra Queries: (5)

Input event with a nested hash and an array:
{"name":"tagomoris", "user":{"age":34, "corp":"LINE", "address":"Tokyo"}, "current":"Tsukuba", "speaker":true, "attend":[true,true,false, ...]}

The query from (3), with age now nested under user. Nested hash fields are accessed with dots:

SELECT user.age, COUNT(*) as cnt
FROM events.win:time_batch(5 mins)
GROUP BY user.age

Array elements are accessed by index as attend.$0, attend.$1:

SELECT user.age, COUNT(*) as cnt
FROM events.win:time_batch(5 mins)
WHERE current="Tsukuba" AND attend.$0 AND attend.$1
GROUP BY user.age

Input event:
{"name":"tagomoris", "user":{"age":34, "corp":"LINE", "address":"Tokyo"}, "current":"Kyoto", "speaker":true, "attend":[true,true,false, ...]}

Norikra and Esper

Esper:
  CEP engine library, Java, GPLv2
  EPL: Event Processing Language (SQL + window)
  Streams: schema-full flat field set

Norikra:
  Using Esper internally
  Schema-less stream -> schema-full stream conversion
  Rewriting compiled queries

Norikra query execution

1. accept query
2. parse -> find target / field set
3. (if target is not opened) wait for first event
4. compile query
5. rewrite target name into stream name
6. rewrite field names
7. register query
8. input events

Target mapping

Ignore unused fields

Field set matching between streams and queries

Generate field set inheritance tree

Automated stream inheritance of Norikra's target

[Diagram, built up over several slides. Legend: Base fieldset (b_...), Query fieldset (q_...), Data fieldset (e_...).]

1. The tree starts with the base fieldset b_xxxxxxxxx.

minimal fieldset definition:
  name: 'string'
  id: 'long'
  valid: 'boolean'
  action_type: 'string'

2. Data fieldset e_xxxxxxxx1 appears under b_xxxxxxxxx.

event data fieldset definition:
  name: 'string'
  id: 'long'
  valid: 'boolean'
  action_type: 'string'
  product_code: 'string'
  charge: 'integer'
  shop_code: 'long'

3. Data fieldset e_xxxxxxxx2 appears next to e_xxxxxxxx1.

event data fieldset definition:
  name: 'string'
  id: 'long'
  valid: 'boolean'
  action_type: 'string'
  product_code: 'string'
  charge: 'integer'
  shop_code: 'long'
  affiliate: 'string'

4. A new query is added:

SELECT count(*)
FROM target.win:time_batch(1min)
WHERE affiliate.length() > 0

5. Query fieldset q_xxxxxxxx0 appears, and e_xxxxxxxx2 is now shown as e_xxxxxxxx2'.

fieldset definition shown with q_xxxxxxxx0:
  name: 'string'
  id: 'long'
  valid: 'boolean'
  action_type: 'string'
  affiliate: 'string'

6. Registered EPL (the target name is replaced by the stream name):

SELECT count(*)
FROM q_xxxxxxxx0.win:time_batch(1min)
WHERE affiliate.length() > 0

7. Later state of the tree: b_xxxxxxxxx, q_xxxxxxxx0, q_xxxxxxxx1, e_xxxxxxxx1', e_xxxxxxxx2', e_xxxxxxxx3'.
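
One way to read the diagram: a data fieldset is placed under a query fieldset when it contains every field of that query fieldset with the same type. The sketch below encodes that reading only; it is an assumption about the matching rule, not Norikra's code, and the names parent_streams and QUERY_FIELDSETS are made up for the example.

```python
# Field definitions copied from the slides.
BASE = {"name": "string", "id": "long", "valid": "boolean", "action_type": "string"}

QUERY_FIELDSETS = {
    "q_xxxxxxxx0": {**BASE, "affiliate": "string"},
}


def parent_streams(event_fieldset, query_fieldsets):
    """Names of query fieldsets whose fields (name and type) all appear in the event fieldset."""
    return sorted(
        name
        for name, fields in query_fieldsets.items()
        if all(event_fieldset.get(field) == type_ for field, type_ in fields.items())
    )


e1 = {**BASE, "product_code": "string", "charge": "integer", "shop_code": "long"}
e2 = {**e1, "affiliate": "string"}

print(parent_streams(e1, QUERY_FIELDSETS))  # []
print(parent_streams(e2, QUERY_FIELDSETS))  # ['q_xxxxxxxx0']
```

Under this reading, only events shaped like e_xxxxxxxx2 reach the query on affiliate, which matches e_xxxxxxxx2 being the one redrawn when q_xxxxxxxx0 appears.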

Query rewriting

Input event:
{"name":"tagomoris", "user":{"age":34, "corp":"LINE", "address":"Tokyo"}, "current":"Kyoto", "speaker":true, "attend":[true,true,false, ...]}

Query:

SELECT user.age, COUNT(*) as cnt
FROM events.win:time_batch(5 mins)
WHERE current="Tsukuba" AND attend.$0 AND attend.$1
GROUP BY user.age

Step 1: the nested event is flattened. Hash keys are joined with ".", and array elements become ".$N":
{"name":"tagomoris", "user.age":34, "user.corp":"LINE", "user.address":"Tokyo", "current":"Kyoto", "speaker":true, "attend.$0":true, "attend.$1":true, "attend.$2":false, ...}

Step 2: only the fields the query uses are kept, and "." in field names is replaced with "$":
{"user$age":34, "current":"Kyoto", "attend$$0":true, "attend$$1":true}
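
The two conversions above can be sketched in Python as follows. This mirrors only what the slides show; the function names flatten, escape and to_stream_event are hypothetical, and the attend array is cut to three elements for the example.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one dict with keys like "user.age" and "attend.$0"."""
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    else:  # list: index N becomes the key "$N"
        items = ((f"${i}", v) for i, v in enumerate(value))
    for key, child in items:
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(child, (dict, list)):
            flat.update(flatten(child, name))
        else:
            flat[name] = child
    return flat


def escape(name):
    """Replace "." with "$": "user.age" -> "user$age", "attend.$0" -> "attend$$0"."""
    return name.replace(".", "$")


def to_stream_event(event, used_fields):
    """Keep only the fields a query uses, keyed by their escaped names."""
    flat = flatten(event)
    return {escape(f): flat[f] for f in used_fields if f in flat}


event = {"name": "tagomoris",
         "user": {"age": 34, "corp": "LINE", "address": "Tokyo"},
         "current": "Kyoto", "speaker": True,
         "attend": [True, True, False]}
used = ["user.age", "current", "attend.$0", "attend.$1"]

print(to_stream_event(event, used))
# {'user$age': 34, 'current': 'Kyoto', 'attend$$0': True, 'attend$$1': True}
```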

The field names in the query are rewritten the same way (user.age becomes user$age, attend.$0 becomes attend$$0):

SELECT user$age, COUNT(*) as cnt
FROM events.win:time_batch(5 mins)
WHERE current="Tsukuba" AND attend$$0 AND attend$$1
GROUP BY user$age

In fact, these conversions are applied to the compiled query object rather than to the query string.
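
For illustration only, the renaming can be imitated on the query text with a regular expression. The slides state that the real conversion works on the compiled query object, so this string-based version is a simplification, and rewrite_fields is a made-up name.

```python
import re


def rewrite_fields(query, fields):
    """Replace each listed dotted field name in the query text with its escaped form."""
    # Longest names first, so a longer name is never shadowed by a shorter one.
    alternatives = "|".join(
        re.escape(f) for f in sorted(fields, key=len, reverse=True)
    )
    # Lookarounds stop a match from starting or ending inside a longer identifier.
    pattern = re.compile(rf"(?<![\w.$])(?:{alternatives})(?![\w.$])")
    return pattern.sub(lambda m: m.group(0).replace(".", "$"), query)


query = ('SELECT user.age, COUNT(*) AS cnt FROM events.win:time_batch(5 mins) '
         'WHERE current="Tsukuba" AND attend.$0 AND attend.$1 GROUP BY user.age')
fields = ["user.age", "attend.$0", "attend.$1"]

print(rewrite_fields(query, fields))
```

Only the listed field names are touched, so events.win:time_batch keeps its dots.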

Norikra internal: Jump from schema-full world to schema-less world.
