TipMax

Hsiang-Hsuan Hung

Motivation

Helping taxi drivers maximize their income

Web App: TipMax http://www.tipmaxnyc.xyz

Data Source

http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml

Pipeline

Flask

Batch process

Problem: the raw data is not ordered by time, and it is large: 220 GB with 13 billion events
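
Since the records arrive in no particular time order, a batch job can group them by a time bucket instead of relying on file order. Below is a minimal sketch of that idea in plain Python. The field names (`pickup_datetime`, `tip_amount`), the timestamp format, and the hourly bucket are illustrative assumptions, not the actual schema or the batch job used by TipMax.

```python
from collections import defaultdict
from datetime import datetime


def average_tip_by_hour(events):
    """Average tip per pickup hour; the input order does not matter."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of tips, count]
    for event in events:
        # Assumed timestamp format; adjust to the real data.
        pickup = datetime.strptime(event["pickup_datetime"], "%Y-%m-%d %H:%M:%S")
        bucket = totals[pickup.hour]
        bucket[0] += event["tip_amount"]
        bucket[1] += 1
    return {
        hour: tip_sum / count
        for hour, (tip_sum, count) in sorted(totals.items())
    }


# Hypothetical, deliberately unordered sample events.
sample_events = [
    {"pickup_datetime": "2015-01-02 18:45:00", "tip_amount": 3.0},
    {"pickup_datetime": "2015-01-01 08:10:00", "tip_amount": 2.0},
    {"pickup_datetime": "2015-01-01 18:05:00", "tip_amount": 5.0},
]
hourly_average = average_tip_by_hour(sample_events)
```

The same keyed aggregation carries over to a distributed batch engine, where the grouping key takes the place of any ordering in the input files.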

Real-Time Pipeline

Flask

Batch process

Engineer real-time streaming

Challenges

•  Connector between Cassandra and Spark

•  Design primary keys for data query

•  Cleaning data
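
The cleaning and key-design items above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the cleaning rules (drop malformed rows, non-positive fares, and negative tips), and the key layout (partition by pickup date and hour, cluster by pickup time) are all hypothetical. The slides do not state the actual Cassandra schema.

```python
from datetime import datetime


def clean_trip(raw):
    """Return a cleaned trip dict, or None if the record is unusable."""
    try:
        pickup = datetime.strptime(
            raw["pickup_datetime"].strip(), "%Y-%m-%d %H:%M:%S"
        )
        fare = float(raw["fare_amount"])
        tip = float(raw["tip_amount"])
    except (KeyError, ValueError, TypeError, AttributeError):
        return None  # missing field or malformed value
    if fare <= 0 or tip < 0:
        return None  # assumed rule: drop non-positive fares and negative tips
    return {"pickup": pickup, "fare": fare, "tip": tip}


def primary_key(trip):
    """Hypothetical key: (date, hour) partition key plus a time clustering key."""
    pickup = trip["pickup"]
    partition_key = (pickup.strftime("%Y-%m-%d"), pickup.hour)
    clustering_key = pickup.isoformat()
    return partition_key, clustering_key


# Hypothetical raw rows: one valid, one malformed, one with a negative tip.
raw_rows = [
    {"pickup_datetime": "2015-01-01 08:10:00", "fare_amount": "10.5", "tip_amount": "2"},
    {"pickup_datetime": "2015-01-01 09:00:00", "fare_amount": "abc", "tip_amount": "1"},
    {"pickup_datetime": "2015-01-01 10:00:00", "fare_amount": "8", "tip_amount": "-1"},
]
cleaned = [trip for trip in map(clean_trip, raw_rows) if trip is not None]
keys = [primary_key(trip) for trip in cleaned]
```

Choosing the partition key to match the most common query (here, "all trips in a given hour of a given day") keeps each lookup within a single partition, which is the usual motivation for designing keys around queries.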

Challenges

•  Time series forecast?
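
One simple baseline for such a forecast is a moving average over recent observations. The sketch below is an assumption for illustration only: the slides leave the forecasting method open, and the function name, window size, and sample series are hypothetical.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(series) < window:
        raise ValueError("not enough observations for the chosen window")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical hourly average tips, oldest first.
hourly_avg_tips = [2.0, 2.5, 3.0, 3.5]
next_hour_estimate = moving_average_forecast(hourly_avg_tips)
```

A baseline like this gives a reference point against which any more elaborate model can be compared.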

About Me

•  UCSD, Physics PhD 2011

•  U Illinois, ECE 2011-2012

•  U Texas Austin, Physics 2012-2015

•  Computational materials science: HPC, e.g. quantum Monte Carlo…

•  Programming, travel, fitness…

More complicated queries

•  Will passengers give higher tips during rush hours?

•  Will tips vary by payment type, year, weather, or number of passengers?

•  …
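
The rush-hour question could be explored with a query along these lines. This is a minimal sketch: the definition of rush hours (weekdays, 7-9 and 16-19), the use of tip divided by fare as the tip rate, and the sample trips are all assumptions made for illustration, and the input is assumed to be cleaned already (positive fares).

```python
from datetime import datetime

RUSH_HOURS = {7, 8, 9, 16, 17, 18, 19}  # assumed definition of rush hours


def is_rush_hour(pickup):
    """True for weekday pickups whose hour falls in the assumed rush hours."""
    return pickup.weekday() < 5 and pickup.hour in RUSH_HOURS


def mean_tip_rate(trips):
    """Mean tip/fare ratio for rush-hour trips and for all other trips."""
    sums = {"rush": [0.0, 0], "other": [0.0, 0]}  # group -> [sum of rates, count]
    for pickup, fare, tip in trips:
        group = "rush" if is_rush_hour(pickup) else "other"
        sums[group][0] += tip / fare
        sums[group][1] += 1
    return {
        group: (total / count if count else None)
        for group, (total, count) in sums.items()
    }


# Hypothetical (pickup time, fare, tip) tuples.
trips = [
    (datetime(2015, 1, 5, 8, 30), 10.0, 2.0),   # Monday morning rush
    (datetime(2015, 1, 5, 17, 0), 20.0, 5.0),   # Monday evening rush
    (datetime(2015, 1, 5, 13, 0), 10.0, 1.0),   # Monday midday
    (datetime(2015, 1, 4, 8, 30), 10.0, 1.0),   # Sunday morning
]
rates = mean_tip_rate(trips)
```

The other questions (payment type, year, weather, passenger count) follow the same pattern with a different grouping function in place of `is_rush_hour`.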