Landmark-Based User Location Inference in Social Media
YUTO YAMAGUCHI†, TOSHIYUKI AMAGASA† AND HIROYUKI KITAGAWA†
†UNIVERSITY OF TSUKUBA
13/10/08
COSN 2013 - Yuto Yamaguchi 1
LOCATION-RELATED INFORMATION
Eating seafood !!!
I’m at Logan airport
Profile
Residence: Tokyo, Japan
COSN @ northeastern
APPLICATIONS
Various research using home locations
  Outbreak Modeling [Poul+, ICWSM’12]
  Real-World Event Detection [Sakaki+, WWW’12]
  Analyzing Disasters [Mandel+, LSM’12]
Other useful applications
  Location-aware Recommender [Levandoski+, ICDE’12]
  Marketing, Ads
  Disaster Warning
OUR PROBLEM
Location profiles are not available for …
  76% of Twitter users [Cheng et al., CIKM’10]
  94% of Facebook users [Backstrom et al., WWW’10]
This reduces the opportunities to use location information
User Home Location Inference
USER HOME LOCATION INFERENCE
Content-Based Approaches
  [Cheng et al., CIKM’10] [Kinsella et al., SMUC’11] [Chandra et al., SocialCom’11]
Graph-Based Approaches
  [Backstrom et al., WWW’10] [Sadilek et al., WSDM’12] [Jurgens, ICWSM’13]
Our focus: Graph-Based Approaches
GRAPH-BASED APPROACH (1/2)
Basic Idea
[Figure: a user whose home location is unknown ("Boston?") and five friends located in Boston, Boston, Boston, Chicago and New York]
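A minimal sketch of this basic idea, assuming a plain majority vote over the friends' known home locations. The cited graph-based methods use more elaborate models; `infer_by_majority` is a hypothetical helper introduced only for illustration.

```python
from collections import Counter


def infer_by_majority(friend_locations):
    """Return the most common known friend location, or None if none is known.

    Assumption: locations are city names; None marks a friend without a
    known home location.
    """
    known = [loc for loc in friend_locations if loc is not None]
    if not known:
        return None
    return Counter(known).most_common(1)[0][0]


friends = ["Boston", "Boston", "Boston", "Chicago", "New York"]
print(infer_by_majority(friends))  # prints "Boston"
```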
GRAPH-BASED APPROACH (2/2)
Closeness Assumption
Friends: spatially close
Not friends: spatially distant
Really close?
60% of friends are 100 km or more apart
CONCENTRATION ASSUMPTION
Boston
Boston?
LANDMARK
Unknown
NY
Chicago
LANDMARKS
REQUIREMENTS
Small Dispersion
Large Centrality
EXAMPLES IN TWITTER
LANDMARKS MAPPING
Red: all users
Blue: landmarks
PROPOSED METHOD
OVERVIEW
Probabilistic Model
Modeling
Each user has his/her location distribution
Location inference = Selecting the location with the largest probability density
location set
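In symbols, and assuming notation that does not appear on the slides ($L$ for the location set, $p_u$ for the location distribution of user $u$), this inference rule could be written as:

```latex
\hat{l}_u = \operatorname*{arg\,max}_{l \in L} \; p_u(l)
```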
LANDMARK MIXTURE MODEL
DOMINANCE DISTRIBUTION
Spatial distribution of followers’ home locations
  Modeled as Gaussian
Landmarks have
  small covariances
  many followers at the center
latitude
longitude
many followers
few followers
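A rough sketch of how a dominance distribution might be estimated, assuming a plain maximum-likelihood Gaussian fit on raw (latitude, longitude) pairs. The function names are hypothetical, and using the determinant of the covariance (the generalized variance) as a dispersion score is an assumption made here, not necessarily the measure used in the talk.

```python
def fit_dominance_distribution(points):
    """Fit a 2-D Gaussian to follower home locations given as (lat, lon).

    Returns (mean, cov), where cov is a 2x2 tuple of tuples.
    Assumption: maximum-likelihood estimates on raw coordinates.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two follower locations")
    mean_lat = sum(p[0] for p in points) / n
    mean_lon = sum(p[1] for p in points) / n
    var_lat = sum((p[0] - mean_lat) ** 2 for p in points) / n
    var_lon = sum((p[1] - mean_lon) ** 2 for p in points) / n
    cov_ll = sum((p[0] - mean_lat) * (p[1] - mean_lon) for p in points) / n
    return (mean_lat, mean_lon), ((var_lat, cov_ll), (cov_ll, var_lon))


def dispersion(cov):
    """Determinant of the covariance; smaller means more concentrated.

    Assumption: this is one possible dispersion score among several.
    """
    return cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]


# Followers concentrated around one city versus followers spread over three.
concentrated = [(42.30, -71.00), (42.40, -71.10), (42.35, -71.00)]
spread = [(42.36, -71.06), (40.71, -74.01), (41.88, -87.63)]
_, cov_concentrated = fit_dominance_distribution(concentrated)
_, cov_spread = fit_dominance_distribution(spread)
print(dispersion(cov_concentrated) < dispersion(cov_spread))
```

Under this sketch, a landmark candidate would be a user whose followers yield a small dispersion value.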
LANDMARK MIXTURE MODEL (LMM)
Inference target user
follow
Landmark
Non-landmark
Non-landmark
Dominance distribution
Mixture weight
Large weight for landmark
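The structure on this slide can be sketched as a weighted sum of the dominance distributions of the users that the inference target follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names, the example weights and covariances, and the three-city candidate set are all made up for the example.

```python
import math


def gaussian_pdf(x, mean, cov):
    """Density of a 2-D Gaussian with 2x2 covariance `cov` at point `x`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean) via the 2x2 inverse.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def mixture_density(x, followed):
    """`followed` is a list of (mixture weight, mean, covariance) triples."""
    return sum(w * gaussian_pdf(x, mean, cov) for w, mean, cov in followed)


def infer_location(candidates, followed):
    """Pick the candidate location with the largest mixture density."""
    return max(candidates, key=lambda name: mixture_density(candidates[name], followed))


# Hypothetical candidate location set (lat, lon).
candidates = {
    "Boston": (42.36, -71.06),
    "New York": (40.71, -74.01),
    "Chicago": (41.88, -87.63),
}
# One landmark (small covariance, large weight) and two non-landmarks.
followed = [
    (0.8, (42.36, -71.06), ((0.1, 0.0), (0.0, 0.1))),
    (0.1, (40.71, -74.01), ((25.0, 0.0), (0.0, 25.0))),
    (0.1, (41.88, -87.63), ((25.0, 0.0), (0.0, 25.0))),
]
print(infer_location(candidates, followed))  # prints "Boston"
```

The landmark dominates the result because its density is sharply peaked and its weight is large, while the non-landmarks contribute only a flat, low background.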
MIXTURE WEIGHTS
Proportional to centrality
Landmark: large mixture weight
Non-landmark: small mixture weight
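As a small illustration of "proportional to centrality", the weights can be obtained by normalizing centrality scores so they sum to one. The slides do not give the centrality formula, so the scores below are placeholder values and `mixture_weights` is a hypothetical helper.

```python
def mixture_weights(centralities):
    """Turn centrality scores into mixture weights that sum to 1.

    `centralities` maps a followed user to a non-negative centrality score.
    """
    total = sum(centralities.values())
    if total <= 0:
        raise ValueError("centralities must have a positive sum")
    return {user: score / total for user, score in centralities.items()}


# Placeholder scores: one landmark and two non-landmarks.
weights = mixture_weights({"landmark": 8.0, "friend_a": 1.0, "friend_b": 1.0})
print(weights)
```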
CONFIDENCE CONSTRAINT
If the distribution does not have a clear peak,
we should not infer the location of that user
High precision but low recall
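One way to read the confidence constraint is as a threshold on how dominant the best candidate is. The sketch below assumes that "clear peak" means the largest score, normalized over the candidate set, is at least a threshold `p0`; the slides do not spell out the exact criterion, so this definition and the function name are assumptions.

```python
def infer_with_confidence(scores, p0):
    """Return the best location, or None when the peak is not clear enough.

    `scores` maps each candidate location to a non-negative density value.
    Assumption: the peak is "clear" when the best score accounts for at
    least a fraction `p0` of the total score over all candidates.
    """
    total = sum(scores.values())
    if total <= 0:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] / total >= p0 else None


peaked = {"Boston": 0.90, "Chicago": 0.05, "New York": 0.05}
flat = {"Boston": 0.40, "Chicago": 0.30, "New York": 0.30}
print(infer_with_confidence(peaked, 0.5))  # prints "Boston"
print(infer_with_confidence(flat, 0.5))    # prints "None"
```

Raising `p0` withholds more predictions, which is the precision-for-recall trade described on the slide.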
CENTRALITY CONSTRAINT
We can reduce the cost by ignoring non-landmarks
Low cost but low recall
Inference target user
follow
Landmark
Non-landmark
Non-landmark
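A minimal sketch of the centrality constraint, assuming each followed user carries a precomputed centrality score and that users below a threshold `c0` are dropped before the mixture is evaluated. The record layout and the function name are hypothetical.

```python
def apply_centrality_constraint(followed, c0):
    """Keep only followed users whose centrality is at least `c0`.

    Assumption: each entry is a dict with a numeric "centrality" field.
    """
    return [user for user in followed if user["centrality"] >= c0]


followed = [
    {"name": "landmark", "centrality": 0.9},
    {"name": "friend_a", "centrality": 0.2},
    {"name": "friend_b", "centrality": 0.1},
]
kept = apply_centrality_constraint(followed, 0.5)
print([user["name"] for user in kept])  # prints "['landmark']"
```

If no followed user passes the threshold, the list is empty and no location can be inferred for that target user, which is where the loss of recall comes from.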
EXPERIMENTS
DATASET
Twitter dataset provided by [Li et al., KDD’12]
  3M users in the U.S.
  285M follow edges
Geocode their location profiles for ground truth
  465K labeled users (15%)
Test set
  46K users (10% of labeled users)
PERFORMANCE COMPARISON
Compared three methods
  LMM: our method
  UDI: [Li+, KDD’12]
  Naïve: Spatial median
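The naïve baseline is only named on the slide. One plausible reading, assumed here, is to pick the friend location that minimizes the total great-circle distance to all other friend locations. The sketch below implements that reading with the haversine formula; it should not be taken as the exact baseline used in the experiments.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def spatial_median(points):
    """Return the input point with the smallest summed distance to the others.

    Assumption: the median is restricted to the given friend locations.
    """
    if not points:
        return None
    return min(points, key=lambda p: sum(haversine_km(p, q) for q in points))


boston = (42.36, -71.06)
new_york = (40.71, -74.01)
chicago = (41.88, -87.63)
print(spatial_median([boston, boston, boston, chicago, new_york]) == boston)  # prints "True"
```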
EFFECT OF CONFIDENCE CONSTRAINT
p0
We can adjust the trade-off between precision and recall
EFFECT OF CENTRALITY CONSTRAINT
c0
We can adjust the trade-off between cost and recall
CONCLUSION
Introduced the concentration assumption
instead of the widely-used closeness assumption
  There exist landmarks
Proposed landmark mixture model
  Outperforms the state-of-the-art method
  Confidence / Centrality constraint
Future work
  Other applications of landmarks
    Recommending landmarks or their tweets