vudang
Motivation
“An extra half-star rating causes restaurants to sell out 19 percentage points more frequently (increase from 30% to 49%)”
Questions
• How can a restaurant identify what its customers want from a large number of reviews?
• What latent topics exist in Yelp Restaurant Reviews?
• … and can these provide any meaningful insights to these restaurants?
Dataset
• 5,000 Restaurants
• Over 158,000 Restaurant Reviews
• Why?
  Ø TRENDS!!!
  Ø Figure out what customers care about
  Ø Tell restaurants what they can do better
Methodology
• Latent Dirichlet Allocation (LDA)
  Ø Each document is a bag of words
  Ø A document covers several topics
  Ø A topic is responsible for each word
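The "bag of words" assumption above means a review is reduced to word counts, with word order discarded. A minimal sketch of that reduction follows; the tokenizing regex and the helper name `bag_of_words` are illustrative assumptions, not taken from the slides.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumed tokenizer: runs of letters and apostrophes count as words.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


review = "We waited an hour and a half. We won't be back."
bag = bag_of_words(review)

# Word order is gone; only the counts remain.
print(bag.most_common(3))
```

LDA then works only on these counts, one bag per review.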
Latent Topics 1 star
“Bummer, we were psyched to have a new burger place. Don't bother- we waited an hour and a half and found out that our waiter "never turned in our order"- Uh, what? We won't be back. The patio is too small and the staff is incompetent. No go!”
• American Cuisine: 36.87%
• Service: 13.49%
• Healthy: 5.27%
• American Cuisine 2: 4.89%
• Lunch: 4.63%
• Decor: 4.64%
• Location: 4.64%
LDA - Expectation Maximization
p(Documents | topics, topic_distribution_for_doc)
initialize topics randomly
repeat until convergence:
    for every document:
        repeat until convergence:
            update topic_distribution_for_doc
    calculate topics on topic_distribution_for_doc
Online LDA
p(Documents | topics, topic_distribution_for_doc)
initialize topics randomly
repeat until convergence:
    for a batch of documents:
        repeat until convergence:
            update topic_distribution_for_doc
    B’ = calculate topics on topic_distribution_for_doc
    topics = (1 - x) * topics + x * B’
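The final step of the online variant blends the current topics with the estimate B’ computed from one batch. A minimal sketch of that blending step alone is given here, with topics stored as lists of word probabilities. The helper names and the decaying schedule for x are assumptions for illustration; the slides do not say how x is chosen.

```python
def blend_topics(topics, batch_topics, x):
    """Return (1 - x) * topics + x * batch_topics, element by element."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return [
        [(1.0 - x) * old + x * new for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(topics, batch_topics)
    ]


def step_size(t, tau0=1.0, kappa=0.7):
    """Assumed schedule (not from the slides): weight shrinks as batches accumulate."""
    return (tau0 + t) ** -kappa


# Two toy topics over a two-word vocabulary; each row sums to 1.
topics = [[0.5, 0.5], [0.2, 0.8]]
batch_topics = [[0.9, 0.1], [0.4, 0.6]]  # plays the role of B'

blended = blend_topics(topics, batch_topics, 0.5)
print(blended)
```

Because the update is a convex combination, each blended row remains a valid probability distribution. A small x means a single batch nudges the topics only slightly.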
Breakdown of Hidden Topics Over All Reviews
[Chart] Topic shares: 29.95% of latent topics (8.8% of all reviews), 21.04%, 13.09%, 10.76%, 9.42%
Results
• Predict stars per hidden topic discovered
  Ø Overall: 4
  Ø Service: 4.5
  Ø Healthiness: 3
• Proof of Concept
• Temporal Insights
Joe’s Farm Grill
Unhealthy? “The side of veggie fries was literally 3 pounds of fried veggies, full of cholesterol, and way too much for any human to consume.” 3.0
Proof of Concept
• Service!
  Ø Median rating: 3.0
  Ø Median weight: 0.05228
  Ø Mean rating: 2.4067
  Ø Mean weight: 0.3899
• Many kinds of food places have no reviews that discuss service at all, which brings high disparity:
  Ø Sandwich, Bagel, Pizza, Cafes, Fast Food, Bars, Smoothies, etc.
Reviews in Service
[Figure panels: Top 25 Good Reviews | Top 25 Bad Reviews]
• NOTE: 45% of the top 25 best/worst service entries are Thai
Reviews in Service Insights
• Western cuisine cares enough to stay off the worst-service list
• 10 times more mentions of Groupon in BAD service reviews than in GOOD ones
Worst Service Providers Explained
• Quality of food is highly correlated with service:
  Ø Fake, badly imitated Asian foods
• Bigram LDA
  Ø Great Food, Great Service
  Ø Bad Food, Service Bad
  Ø Halo Effect, Cognitive Bias
• Average ratings of restaurants are lower for bad service reviews
  Ø Good: 4.12, Bad: 3.46
Temporal Data
• Breakfast, Lunch, Dinner Scores
  Ø Average across all reviews with these hidden subtopics
• Check-in Data
  Ø Determine times when a restaurant is busiest
Temporal Insights
• Breakfast, Lunch, Dinner Scores vs. Check-in Data
  Ø Only 23% of restaurants are rated the highest during their peak busy hours (i.e., when they are most popular)
  Ø On average, restaurants are rated 0.4 lower when they are busier
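One way such a comparison could be computed is sketched below. The data layout, the function name `peak_period_stats`, and the toy numbers are all hypothetical. The slides do not describe the actual calculation, so this is an illustration rather than the method behind the 23% and 0.4 figures.

```python
def peak_period_stats(restaurants):
    """Compare each restaurant's busiest meal period with its best-rated one.

    Returns (share of restaurants whose best-rated period is also the busiest,
             mean rating gap between the busiest period and the other periods).
    """
    best_at_peak = 0
    gaps = []
    for r in restaurants:
        # Busiest period: the one with the most check-ins.
        busiest = max(r["checkins"], key=r["checkins"].get)
        # Best-rated period: the one with the highest topic score.
        top_rated = max(r["scores"], key=r["scores"].get)
        if busiest == top_rated:
            best_at_peak += 1
        off_peak = [s for p, s in r["scores"].items() if p != busiest]
        gaps.append(r["scores"][busiest] - sum(off_peak) / len(off_peak))
    n = len(restaurants)
    return best_at_peak / n, sum(gaps) / n


# Hypothetical toy data, not from the Yelp dataset.
restaurants = [
    {"scores": {"breakfast": 4.0, "lunch": 3.5, "dinner": 3.0},
     "checkins": {"breakfast": 10, "lunch": 40, "dinner": 80}},
    {"scores": {"breakfast": 3.0, "lunch": 4.5, "dinner": 4.0},
     "checkins": {"breakfast": 5, "lunch": 60, "dinner": 30}},
]

share, mean_gap = peak_period_stats(restaurants)
print(share, mean_gap)
```

A negative mean gap would indicate that restaurants tend to be rated lower in their busiest period than in the others.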