Data Engineering in Practice: The DMP System Behind SmartNews Ads
Lan
Who am I
• Lan
• Veteran hacker but new to the AD world
• someone who can make a computer do what he wants—whether the computer wants to or not. (http://paulgraham.com/gba.html)
• ex-{Rakuten, GREE}
• Distributed Systems, Information Retrieval, ML
Today’s Talk
• DMP in SmartNews Ads
• #1. Prediction
• #2. Targeting
• Future Work & Summary
DMP = Data Management Platform
DMP in SmartNews Ads
• Private DMP (90%+ 1st-party data)
• Data Collect, Clean, Aggregation
• ID Mapping
• User Profiling
• User Clustering
• CTR / CVR Prediction
• Lookalike
• Custom Audience
[Figure: DMP architecture. Labels: DMP clusters, AD delivery cluster, Video AD delivery cluster, AD tracker, Kinesis, AD Log in S3, SmartNews Log, DMP streaming, Hadoop, ML, Analytics, Audience Data in DynamoDB / RDB, Models & Targeting]
Small company but not small data
• Article Meta > 200K/day
• Article x {read, share, read_related, …}
• Channel x {subscribe, preview, view, …}
• Push, Live, Weather, Setting, …
• Survey result
• Audience Data > 14M (~5M MAU)
• AD Meta
• AD History
• AD Conversions
• AD Optout
• Managed/Compressed Data > 130TB
• Lookalike seeds
• ~1TB Data for training CTR prediction model
• > 1M unique features
• User Demographics
• Device
• Locations
• …
#1 Prediction
Pick an AD to feed here
Similar to Recommendation, but DIFFERENT
• optimization goal
• accuracy of the probability
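More than Ranking
• When we run an AD auction
• eCPM (effective Cost per Mille) = CTR (Click Through Rate) x CPC (Cost per Click) x 1000, so ADs are ranked by CTR x CPC
• Suppose we have
• CTR_ad1 = 0.05 > CTR_ad2 = 0.04 > CTR_ad3 = 0.03
• CPC_ad1 = 10 JPY, CPC_ad2 = 13 JPY, CPC_ad3 = 20 JPY (winner: 0.03 x 20 = 0.6 JPY per impression)
• but if the predictions are pCTR_ad1 = 0.2 (winner) > pCTR_ad2 = 0.1 > pCTR_ad3 = 0.03
• then we lose 0.1 JPY of potential income per impression (ad1 really earns 0.05 x 10 = 0.5 JPY instead of ad3's 0.6 JPY)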
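The auction example above can be replayed in a few lines. This is only a sketch using the numbers from the slide; the dictionary layout and the helper names `winner` and `expected_revenue` are made up for illustration.

```python
# Three-AD auction with the numbers from the slide.
# "ctr" is the true click-through rate, "pctr" the (mis)predicted one.
ads = {
    "ad1": {"ctr": 0.05, "pctr": 0.20, "cpc": 10.0},
    "ad2": {"ctr": 0.04, "pctr": 0.10, "cpc": 13.0},
    "ad3": {"ctr": 0.03, "pctr": 0.03, "cpc": 20.0},
}

def winner(ads, ctr_key):
    """Return the AD with the highest CTR x CPC under the given CTR column."""
    return max(ads, key=lambda name: ads[name][ctr_key] * ads[name]["cpc"])

def expected_revenue(ads, name):
    """Expected income per impression in JPY, always using the true CTR."""
    return ads[name]["ctr"] * ads[name]["cpc"]

ideal = winner(ads, "ctr")    # ranking by the true CTR
actual = winner(ads, "pctr")  # ranking by the predicted CTR
loss = expected_revenue(ads, ideal) - expected_revenue(ads, actual)
print(ideal, actual, round(loss, 2))
```

The ranking only needs the order of CTR x CPC, but the income depends on the true CTR, which is why the accuracy of the predicted probability matters and not just the ordering.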
The CTR(CVR) prediction Problem
μ(a, u, c) = p(click | a,u,c)
CTR Prediction v1
• Train and score daily
• One GBDT (Gradient Boosting Decision Tree) model per AD campaign
• using ~1 month of data
• Hundreds of small batches inside Hadoop YARN
• Quick and Simple
• developed in 1 month
• picks the best features for every campaign
• minutes ~ 1 hour for model training
• explainable tree models
• no need for AD features
• Same approach for CVR prediction (CPA (Cost Per Acquisition) = CPC / CVR)
[Figure: CTR Prediction v1 pipeline. Labels: delivery result, user features, generate samples, YARN, repeated sample → model → scoring batches, users, predictions]
Metrics
• NE (Normalized Cross-Entropy)
• the average log loss per impression using the predicted CTR, divided by the average log loss per impression if the model always predicted the background (average) CTR
• https://facebook.com//download/321355358042503/adkdd_2014_camera_ready_junfeng.pdf
• AUC (Area under the ROC curve, AUROC)
• measure ranking quality
• others: Precision/Recall, ECS (Effective Catalog Size), CTR / CVR / Sales, etc.
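Both headline metrics are easy to compute by hand on a small sample. The sketch below is a plain-Python illustration, not the production code; the labels and predicted probabilities are invented toy data.

```python
import math

def log_loss(labels, probs):
    """Average log loss per impression."""
    eps = 1e-12  # keep probabilities away from 0 and 1 so log() is defined
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def normalized_entropy(labels, probs):
    """Model log loss divided by the log loss of always predicting the background CTR."""
    background_ctr = sum(labels) / len(labels)
    baseline = log_loss(labels, [background_ctr] * len(labels))
    return log_loss(labels, probs) / baseline

def auc(labels, probs):
    """Chance that a random click outscores a random non-click (ties count half)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1 = click, 0 = no click.
labels = [1, 0, 0, 1, 0, 0, 0, 0]
probs = [0.7, 0.2, 0.1, 0.6, 0.3, 0.2, 0.1, 0.4]
print(normalized_entropy(labels, probs), auc(labels, probs))
```

NE below 1.0 means the model beats the constant background-CTR predictor (lower is better), while AUC only looks at the ordering of the scores, so a model can have a good AUC and still be badly calibrated.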
Review of CTR Prediction v1
• Marked improvement, moderate AUC & NE
• However
• hard to tune overall
• hard to predict online (feature sets differ per campaign)
• latency for new campaigns
• relatively poor performance for new campaigns (cold start)
• loses the connections between campaigns, even for the same advertiser
• …
CTR Prediction v2
• One simple model for all campaigns
• AD features added
• Dynamic feature extraction
• All computation distributed
• GBDT + Logistic Regression
• Train once per day, score twice
About the Features
• >1M unique features, sparse
• GBDT provides great feature engineering
• (sometimes) feature engineering is a matter of intuition and trial and error
• demographic, device, location, reading interests…
• AD history is helpful
• Feature Hashing, Binarization & Discretization, …
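To make the last bullet concrete, here is a minimal sketch of feature hashing combined with discretization and binarization. The bucket count, the age boundaries, and the field names in the user record are all assumptions chosen for illustration, not the values used in the actual system.

```python
import hashlib

NUM_BUCKETS = 2 ** 20  # hypothetical size of the hashed feature space

def hash_feature(name, value, num_buckets=NUM_BUCKETS):
    """Feature hashing: map a (name, value) pair to a stable column index."""
    digest = hashlib.md5(f"{name}={value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def discretize(value, boundaries):
    """Discretization: index of the bucket a numeric value falls into."""
    for i, upper in enumerate(boundaries):
        if value < upper:
            return i
    return len(boundaries)

def to_sparse_row(user):
    """Turn a raw user record into a sorted list of active column indices."""
    age_bucket = discretize(user["age"], boundaries=[20, 30, 40, 50])
    pairs = [
        ("device", user["device"]),
        ("location", user["location"]),
        ("age_bucket", age_bucket),
    ]
    # Binarization: a subscribed channel is simply present (1) or absent (0).
    pairs += [("channel", ch) for ch in user["channels"]]
    return sorted({hash_feature(name, value) for name, value in pairs})

row = to_sparse_row(
    {"age": 34, "device": "ios", "location": "Kansai", "channels": ["Travel", "Gourmet"]}
)
print(row)
```

Hashing keeps the feature space at a fixed width no matter how many distinct values appear, at the cost of occasional collisions; a cryptographic hash is used here only because it is stable across runs, unlike Python's built-in `hash` for strings.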
Performance improvement
#2 Targeting
[Figure labels: Watabe, TamTam, Komiya, Takei, Ikeishi, Nagase, Lan, Niku, Game, Beer, Snack, Costume, Gourmet, Princess]
It’s difficult comparing to
Profiling User by Statistics and ML
• Gender Prediction (precision: 0.90+), Age Prediction, …
• News Channel / Source Preference
• AD Slot Preference
• …
Standard Targeting
• Example: female users in Kansai who subscribe to the Travel channel
Lookalike Targeting
• Our solution
• Solve it as a classification problem
• Seed users as positive samples
• All targeting candidates as negative samples (with random sampling)
• based on Spark MLlib Logistic Regression
• CVR 30%~50% higher compared to standard targeting
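The classification framing can be sketched end to end on synthetic data. The talk's implementation uses Spark MLlib Logistic Regression; the version below swaps in a small hand-written logistic regression so it runs with the standard library only, and the two-dimensional user features, sample sizes, and audience size are invented for illustration.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(samples, labels, epochs=200, lr=0.1):
    """Batch gradient descent on the log loss; returns (weights, bias)."""
    dim, n = len(samples[0]), len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def score(model, x):
    """Predicted probability that user x looks like a seed user."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

rng = random.Random(0)
# Synthetic 2-d user features: seeds cluster around (1, 1), the population around (0, 0).
seeds = [[rng.gauss(1.0, 0.3), rng.gauss(1.0, 0.3)] for _ in range(50)]
candidates = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(1000)]

# Seeds are positives; a random sample of all candidates stands in as negatives.
negatives = rng.sample(candidates, 200)
model = train_logistic_regression(seeds + negatives, [1] * len(seeds) + [0] * len(negatives))

# Rank every candidate and keep the top slice as the lookalike audience.
ranked = sorted(candidates, key=lambda x: score(model, x), reverse=True)
lookalike_audience = ranked[:100]
```

Some sampled negatives will in fact resemble the seeds, which makes the labels noisy; random sampling keeps that noise small as long as the seed-like users are a minority of the candidate pool.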
Article Keyword Targeting
• Reach (unique users, UU) is calculated in real time
• Only users who exceed a certain read-time threshold are included
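The reach rule can be written down as a small function. This is a sketch only: the event tuple layout, the threshold value, and the choice to apply the threshold per read (rather than to a user's cumulative read time) are assumptions, since the slide does not specify them.

```python
READ_TIME_THRESHOLD_SEC = 10.0  # hypothetical threshold

def reach_uu(read_events, keyword_articles, threshold=READ_TIME_THRESHOLD_SEC):
    """Count unique users who read a matching article for longer than the threshold.

    read_events: iterable of (user_id, article_id, read_seconds) tuples.
    keyword_articles: set of article ids that match the targeted keyword.
    """
    return len(
        {
            user_id
            for user_id, article_id, read_seconds in read_events
            if article_id in keyword_articles and read_seconds > threshold
        }
    )

events = [
    ("u1", "a1", 42.0),
    ("u1", "a2", 15.0),  # same user again, counted once
    ("u2", "a1", 3.0),   # too short, excluded
    ("u3", "a9", 60.0),  # article does not match the keyword
    ("u4", "a2", 11.0),
]
print(reach_uu(events, keyword_articles={"a1", "a2"}))
```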
Custom Audience
[Figure: Custom Audience data flow. Labels: Your Service / App / Site, SmartNews AD tracker, Event, Audience, BloomFilter object, SmartNews AD Delivery Cluster, AD targeting / Delete Targeting, Lookalike, Lookalike Targeting]
• Send any custom event (S2S request, web beacon, etc.)
• BloomFilter object updated every several minutes
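The Custom Audience flow stores audience membership in a Bloom filter object. The slides do not show its implementation, so the following is a generic Bloom filter sketch with assumed sizes (`num_bits`, `num_hashes`) and made-up user ids, meant only to show why the structure suits a fast "is this user in the audience?" check at delivery time.

```python
import hashlib

class BloomFilter:
    """Compact set membership: no false negatives, rare false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )

# Users who triggered a custom event become the audience.
audience = BloomFilter()
for user_id in ["user-001", "user-002", "user-003"]:
    audience.add(user_id)

print("user-001" in audience, "user-999" in audience)
```

A plain Bloom filter cannot remove individual members, so one way to honor a periodic refresh is to rebuild the whole object from the latest events, which would fit the "updated every several minutes" note; that link is an inference, not something the slides state.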
Future Work
Targeting Audience by Interests
Collect Negative Signals to Optimize UX
Summary of My 1st SmartNews Year
• A challenging place. We're a startup, so we can move fast and break things
• Learn from the industry leaders. Keep up the trial and error.
• Numbers don't lie. Don't trust your intuition over the numbers.
• But if you really doubt the numbers, look closely. There may be a hidden bug.