[db tech showcase Tokyo 2016] E22: Getting real time Oracle data into Kafka and unlocking the data...


© 2016 Dbvisit Software | dbvisit.com

Dbvisit Software
Real-time Oracle Database Streaming into Kafka
Chris Lawless

Agenda

• Oracle OLTP
• Evolution of data warehouses
• Data Lake
• Intro to Kafka: what need does it fill?
• Marriage of the two

About Dbvisit Software

•  Real-time Oracle Database Streaming software solutions

•  In the Cloud | Hybrid | On-Premise

•  New Zealand-based, US office, Asia sales office, EU office (Prague)

•  Unique offering: disaster recovery solutions for Oracle Standard Edition

•  Low cost Oracle GoldenGate alternative

•  Flexible licensing, pricing models available

•  Peerless customer support

Result: 1,100+ customers on 6 continents

About Chris

巨人 (Giant)

About Chris

•  4 Years at Oracle University teaching DBA courses

•  5 Years at GoldenGate Support and Product Management

•  4 Years at Oracle GoldenGate Product Management

•  Past 3 years at Dbvisit

The World we live in

The Situation:

✓ The enterprise is increasingly powered by data
✓ OLTP transactional data essential
✓ The use of real-time data for competitive advantage is disrupting most industries
✓ Traditional databases are not going away, new database technologies are being added
✓ Continuous replication data streams becoming a “first class citizen”

Reality of RDBMS

RDBMS

✓ Millions of Oracle databases out there
✓ OLTP databases are ingrained in the business
✓ Pervasive
✓ ERPs
✓ CRMs

OLTP

RDBMS

✓ MySQL #1 leader in databases
✓ MSSQL #1 leader is sold
✓ IBM DB2 #1 in most installs
✓ Oracle #1 in most sales
✓ Oracle is reported to have over 50% of all RDBMS sales
✓ Oracle is here to stay

OLTP Structured Data

• Nice and structured
• Columns
• Rows
• Relationships

OLTP systems

• Banking
• Online shopping
• Stock Markets
• Healthcare
• ERP Systems
• Customer Relations Management

OLTP systems

[Diagram: OLTP Database]

OLTP systems with Data Warehouses: Old school

• OLTP systems typically feed Data Warehouses via batch jobs
• Banking statements that get mailed monthly
• Sales analysis on what was sold last month
• Reporting on ERP systems
• Quarterly Financial reports

OLTP with Data Warehouse

[Diagram: OLTP Database, Batch ETL Process, Data Warehouse database]

OLTP systems with Data Warehouses: REAL-TIME

• Online Shopping with INSTANT emails regarding your shopping habits
• ERP systems with INSTANT information regarding current sales
• Online Banking with access to years of historical data

OLTP with Data Warehouse

[Diagram: OLTP Database, Real-time Streaming, Data Warehouse database]

The concept of Data Lake or Data Reservoir

Not all data is structured

• What about IoT data?
• What about machine data?
• What about log data?
• Semi-structured data?

The new concept of Data Lake or Data Reservoir

• A Data Lake is storage to hold vast amounts of RAW data that is typically kept in the native format
• Often using huge unstructured nodes
• Hadoop is the frequent repository of choice

The new concept of Data Lake or Data Reservoir

[Diagram: sources feeding the Data Lake: Machine Data, IoT, Web logs, Application logs, Streaming Web Data, Other, and two OLTP Databases, labeled ETL and Real-time Streaming]

Kafka: a brief history

• Open sourced in 2011
• Developed at LinkedIn and then ‘released to the world’ as part of the Apache Software Foundation
• The original developers spun off to form Confluent
  - Kafka Connect: a framework that makes it simple to define connectors to move data in and out of Kafka
• Key features:
  - Simple API for producers and consumers
  - High throughput
  - Scaled-out architecture
  - Non-formatted messages

Intro to Kafka

What is Kafka?
A distributed system where messages are kept in topics that are partitioned and replicated across multiple nodes.

Message
Simply put… the data.
Messages can be in any format. Common ones are String, JSON, Avro.
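A minimal sketch of what "any format" means in practice: the broker stores message payloads as opaque bytes, so the choice of String or JSON is made by the client code that encodes and decodes them. The record and its field names below are hypothetical, and Avro is left out because it needs a third-party library.

```python
import json

# Hypothetical example record; the field names are illustrative only.
record = {"table": "ORDERS", "order_id": 1001, "status": "SHIPPED"}


def encode_string(value: str) -> bytes:
    """Encode a plain string message as UTF-8 bytes."""
    return value.encode("utf-8")


def encode_json(value: dict) -> bytes:
    """Encode a dict as a JSON message (keys sorted for stable output)."""
    return json.dumps(value, sort_keys=True).encode("utf-8")


def decode_json(payload: bytes) -> dict:
    """Decode a JSON message back into a dict on the consumer side."""
    return json.loads(payload.decode("utf-8"))


payload = encode_json(record)
round_tripped = decode_json(payload)
```

The producer and the consumer only need to agree on the encoding; nothing in the payload itself tells the broker which format was used.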

Intro to Kafka

Topics
One or more Partitions that are ordered sequences of messages.

Producers (Publishers)
Produce data to one or more topics

Consumers (Subscribers)
Subscribe to topics and process the messages
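The relationship between topics, partitions, producers, and consumers can be sketched with a small in-memory model. This is not the Kafka client API: the `Topic` class and its methods are hypothetical, and `zlib.crc32` stands in for the real partitioner's hash function.

```python
import zlib


class Topic:
    """Toy model of a topic: a fixed number of ordered partitions."""

    def __init__(self, name: str, num_partitions: int):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Stable hash, so the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key: str, value: str):
        """Producer side: append to the partition chosen by the key."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def consume(self, partition: int, offset: int = 0):
        """Consumer side: read a partition in order from a given offset."""
        return self.partitions[partition][offset:]


orders = Topic("orders", num_partitions=3)
p1, o1 = orders.produce("customer-42", "order created")
p2, o2 = orders.produce("customer-42", "order paid")
messages = orders.consume(p1)
```

Because both messages share a key, they land in the same partition and are read back in the order they were produced. Ordering is only guaranteed within a partition, not across the whole topic.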

[Diagram: "Old method" (Sources connected to Targets) compared with Kafka sitting between Sources and Targets]

Intro to Kafka

[Diagram: three Producers, Kafka, three Consumers]

Kafka

[Diagram: Partition 0, Partition 1, Partition 2, each ordered from Old to New]

Kafka

• Kafka treats each topic partition as a log (a sequential, ordered set of messages)
• You can call Kafka a log reader and a log writer
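The "partition as a log" idea can be illustrated with an append-only list where each message gets an offset and each reader tracks its own position. The class names `PartitionLog` and `LogReader` are hypothetical and exist only for this sketch.

```python
class PartitionLog:
    """Append-only log: messages are never modified, only added at the end."""

    def __init__(self):
        self._messages = []

    def append(self, message: str) -> int:
        """Log writer: append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset: int, max_messages: int = 10):
        """Return up to max_messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


class LogReader:
    """Log reader: remembers how far it has read, independent of other readers."""

    def __init__(self, log: PartitionLog):
        self.log = log
        self.position = 0

    def poll(self, max_messages: int = 10):
        batch = self.log.read(self.position, max_messages)
        self.position += len(batch)
        return batch


log = PartitionLog()
for text in ["m0", "m1", "m2"]:
    log.append(text)

fast = LogReader(log)
slow = LogReader(log)
fast_batch = fast.poll()                # reads everything written so far
slow_batch = slow.poll(max_messages=1)  # reads only the oldest message
```

Reading does not remove anything from the log, which is why several consumers can process the same partition at their own pace.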

Kafka

• Log compaction/log retention
• Kafka Streams: the new stuff from Confluent
  - No need for Spark or other tools
  - Pure streaming of the data
  - Process data “on the fly”
  - Kafka 0.10.0
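As a rough illustration of the two clean-up policies named above: retention drops records once they are too old, while compaction keeps only the latest record for each key. The functions below are a simplified sketch over in-memory lists, with hypothetical names; the real broker applies these policies to log segments in the background.

```python
def apply_retention(log, now: float, max_age_seconds: float):
    """Retention: keep only records whose timestamp is recent enough.

    Each record is a (timestamp, key, value) tuple.
    """
    return [rec for rec in log if now - rec[0] <= max_age_seconds]


def compact(log):
    """Compaction: keep the most recent record per key, in offset order.

    Each record is a (key, value) tuple; the list index is the offset.
    Returns (offset, key, value) tuples.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later offsets overwrite earlier ones
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(offset, key, value) for key, (offset, value) in survivors]


updates = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
compacted = compact(updates)

timed = [(100.0, "a", 1), (160.0, "b", 2), (190.0, "a", 3)]
retained = apply_retention(timed, now=200.0, max_age_seconds=60.0)
```

Compaction suits change streams from a database table, where a consumer that starts late usually only needs the current state of each row.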

Marriage of two worlds

• Mix the ‘old world’ log readers with the ‘new world’ log readers and writers
• Blended technology
  - Using the Oracle logical replication tools with Kafka as the message broker
  - Oracle becomes ‘just another feed for Kafka’
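One way to picture Oracle as "just another feed" is a function that turns a row-level change into a keyed JSON message. Everything here is an assumption made for illustration: the change record layout (`schema`, `table`, `op`, `scn`, `before`, `after`), the topic naming, and the function name are hypothetical and are not the format used by Dbvisit or any other replication product.

```python
import json


def change_to_message(change: dict, key_columns: list):
    """Map a hypothetical row change to (topic, key bytes, value bytes)."""
    # Deletes have no "after" image, so fall back to the "before" image.
    row = change["after"] if change["after"] is not None else change["before"]
    key = json.dumps({col: row[col] for col in key_columns}, sort_keys=True)
    value = json.dumps(change, sort_keys=True)
    topic = f'{change["schema"]}.{change["table"]}'
    return topic, key.encode("utf-8"), value.encode("utf-8")


# Hypothetical change as a log reader might report it.
update = {
    "schema": "SCOTT",
    "table": "EMP",
    "op": "UPDATE",
    "scn": 1234567,
    "before": {"EMPNO": 7369, "SAL": 800},
    "after": {"EMPNO": 7369, "SAL": 950},
}

topic, key, value = change_to_message(update, key_columns=["EMPNO"])
```

Using the primary key as the message key keeps all changes to one row in the same partition, so consumers see them in commit order.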

Oracle Redo logs

• Reading the Oracle redo logs is not easy. Oracle doesn’t really publish the API.
• Because of this, replication companies have ‘sprung up’ around the moving of Oracle data.

Who can do this?

Logical Replication to Kafka: high-level overview

[Diagram labels: JSON, THL]

Way of the (New) World

Key Concepts

Real-Time Data/Event Streaming
• A continuous flow of instantaneous data with as close to zero latency as possible.

Real-Time Stream Processing
• Systems that continuously process incoming data, and will continue to process that incoming data until the application is stopped, rather than operating on a fixed set of data.
• Indicative use cases:
  - Financial Trading
  - Real-time System Monitoring
  - Business Intelligence
  - Real-time Analytics
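The difference between operating on a fixed set of data and processing a stream can be shown with a running average. This is a plain-Python sketch with hypothetical function names, not a stream processing framework: the batch version needs all values up front, while the streaming version emits an updated result for each event as it arrives.

```python
from typing import Iterable, Iterator


def batch_average(values: Iterable[float]) -> float:
    """Batch style: wait for the complete data set, then compute once."""
    values = list(values)
    return sum(values) / len(values)


def streaming_average(events: Iterable[float]) -> Iterator[float]:
    """Stream style: keep running state and emit a result per event."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count


prices = [10, 20, 30]
batch_result = batch_average(prices)
running_results = list(streaming_average(prices))
```

The streaming function works just as well on an unbounded source such as a generator that never ends, which is the situation a real-time monitoring or trading system is in.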

Automotive: Kafka Streaming

OLTP and Kafka

Streaming data that can be USED as it moves

•  Weather  

•  Tolls  

•  Sensor  

•  Mileage  data  

•  Tire  pressure  

•  GPS  

Healthcare: Kafka Streaming

OLTP and Kafka

•  Prescrip2ons  

•  Insurance  

•  Medical  devices  

•  Medical  history  

•  etc  

肉 (Meat)

Thank you

Q & A