jinchao-lin
Motivation
• Twitter is a rich stream of information
• There is no effective way to query Twitter
• It is hard to monitor topics of interest in real time
Search Tweets Like a Professional
A real-time Twitter search engine that lets you search by:
• Keywords
◦ Country
◦ Language
◦ Negative words
Demo (http://searchyourtweet.info:5000/input)
Keep an eye on the topics that interest you
• Tell us what you are interested in, and we will keep you updated on the newest events
• Video (https://youtu.be/GdRmXNfukos)
Data pipeline
[Diagram: Query Controller, Backend Database, Percolator, Logic Layer, Frontend, Searching database, Data Backup, Pub/Sub. Labeled flows: register query, publish matching query, searching.]
Real-Time Monitoring on Twitter
◦ Implemented using the ElasticSearch Percolator
◦ Think of it as “search in reverse”
◦ Users register queries with the percolator
◦ The percolator matches incoming documents against the registered queries
◦ Challenges:
◦ How to design the percolator data pipeline?
◦ How to decouple the backend database from the frontend server?
◦ Solution: use the publish/subscribe design pattern
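To make “search in reverse” and the publish/subscribe decoupling concrete, here is a minimal in-memory sketch. It is not the deck's implementation: the `PubSub` and `ToyPercolator` classes, the keyword-set matching rule, and the channel name `q-storm` are all hypothetical stand-ins for the real ElasticSearch percolator and message broker.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class PubSub:
    """Tiny in-process publish/subscribe broker (stand-in for a real broker)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: dict) -> int:
        # Deliver to every subscriber of the channel; return the receiver count.
        for callback in self._subscribers[channel]:
            callback(message)
        return len(self._subscribers[channel])


class ToyPercolator:
    """Stores queries and matches each incoming document against them."""

    def __init__(self, broker: PubSub) -> None:
        self._broker = broker
        self._queries: Dict[str, Set[str]] = {}

    def register(self, query_id: str, keywords: List[str]) -> None:
        # Assumption: a "query" is just a set of lowercase keywords.
        self._queries[query_id] = {k.lower() for k in keywords}

    def percolate(self, tweet: dict) -> List[str]:
        # Search in reverse: the document is the input, the queries are the index.
        words = set(tweet["text"].lower().split())
        matched = [qid for qid, kws in self._queries.items() if kws <= words]
        for qid in matched:
            # The backend only publishes; it never talks to the frontend directly.
            self._broker.publish(qid, tweet)
        return matched


broker = PubSub()
percolator = ToyPercolator(broker)
received: List[dict] = []

percolator.register("q-storm", ["storm", "toronto"])
broker.subscribe("q-storm", received.append)

percolator.percolate({"text": "Big storm heading to Toronto tonight"})
percolator.percolate({"text": "Sunny day in Waterloo"})
print(received)
```

Only the first tweet contains every keyword of the registered query, so only it reaches the subscriber. The percolator knows channel names but nothing about who is listening, which is the decoupling the pattern is meant to provide.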
Real-Time Monitor Data Flow
[Diagram: Percolator, Query database, Twitter database, Controller, Pub/Sub. Labeled flows: subscribe, open channel.]
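As a rough illustration of the register-and-match steps in this data flow, the request bodies below show how a percolator index could be set up. This sketch assumes a recent ElasticSearch version, where percolation uses a `percolator` field type and a `percolate` query; older releases exposed a different API, and the deck does not say which version was used. The index name and the `text` field are hypothetical.

```python
import json

# Hypothetical index name, chosen for illustration only.
PERCOLATOR_INDEX = "tweet-queries"

# Mapping: one field stores the registered query, one mirrors the tweet text
# so the stored queries can be analyzed against incoming documents.
index_mapping = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "text": {"type": "text"},
        }
    }
}


def registered_query(keyword: str) -> dict:
    """Body of the document that stores one user's query in the index."""
    return {"query": {"match": {"text": keyword}}}


def percolate_request(tweet_text: str) -> dict:
    """Search body asking which stored queries match one incoming tweet."""
    return {
        "query": {
            "percolate": {
                "field": "query",
                "document": {"text": tweet_text},
            }
        }
    }


print(json.dumps(percolate_request("storm in Toronto"), indent=2))
```

Each hit returned for the percolate request would identify a stored query, which the controller could then publish on the matching Pub/Sub channel.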
Challenge
How to build a high-throughput, real-time backend data pipeline?
• Use Logstash!
◦ Highly scalable
◦ Compatible with different sources and destinations
A scalable, high-throughput pipeline
Current backend pipeline
Challenge
• Real-time updates on the frontend client:
◦ Instead of polling with the JavaScript “setInterval()” function, I use “socketIO” to keep a socket open between the front-end client and the Flask server
• Constructing ElasticSearch queries
◦ Use the Python requests library to query ElasticSearch
• Fine-tuning ElasticSearch
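The query-construction step might look something like the sketch below, which turns the search form fields (keywords, country, language, negative words) into an ElasticSearch bool query. The field names `text`, `country`, and `lang`, the helper `build_search_body`, and the treatment of negative words as `must_not` exclusions are assumptions for illustration; the deck does not show the actual query or index schema.

```python
from typing import List, Optional


def build_search_body(
    keywords: str,
    country: Optional[str] = None,
    language: Optional[str] = None,
    negative_words: Optional[List[str]] = None,
    size: int = 20,
) -> dict:
    """Build a bool query body from the search form fields."""
    # Keywords are scored full-text matches.
    bool_query: dict = {"must": [{"match": {"text": keywords}}]}

    # Country and language are exact filters that do not affect scoring.
    filters = []
    if country:
        filters.append({"term": {"country": country}})
    if language:
        filters.append({"term": {"lang": language}})
    if filters:
        bool_query["filter"] = filters

    # Negative words exclude any tweet that mentions them.
    if negative_words:
        bool_query["must_not"] = [{"match": {"text": w}} for w in negative_words]

    return {"query": {"bool": bool_query}, "size": size}


body = build_search_body(
    "election", country="CA", language="en", negative_words=["spam"]
)

# Sending it with requests would look roughly like this
# (the URL and index name are assumptions):
#   import requests
#   resp = requests.post("http://localhost:9200/tweets/_search", json=body, timeout=5)
#   hits = resp.json()["hits"]["hits"]
```

Putting country and language in `filter` rather than `must` keeps them out of relevance scoring and lets ElasticSearch cache them, which is one of the simpler tuning choices available.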
About Me
M.Math, University of Waterloo
◦ Field: Statistics and Machine Learning
B.S., University of Toronto
◦ Field: Applied Mathematics
Data Scientist Intern, Neon Inc., San Francisco
Back-end Model Developer, MetricAid Inc., Toronto
Experience in Deep Learning:
◦ Convolutional Network, Recurrent Network
•OS/161 (a simplified POSIX OS)