46
A.E. Cano, A. Varga and F. Ciravegna The Oak Group, Department of Computer Science, The University of Sheffield Volatile Volatile Classification of Point of Interests based on Social Awareness Streams

Volatile Classification of Point of Interests based on Social Activity Streams

Embed Size (px)

DESCRIPTION

Location sharing services(LSS) like Foursquare, Gowalla and Face- book Places gather information from millions of users who leave trails in loca- tions (i.e. chekins) in the form of micro-posts. These footprints provide a unique opportunity to explore the way in which users engage and perceive a point of interest (POI). A POI is as a human construct which describes information about locations (e.g restaurants, cities). In this work we investigate whether the collec- tive perception of a POI can be used as a real-time dataset from which POI’s transient features can be extracted. We introduce a graph-based model for profil- ing geographical areas based on social awareness streams. Based on this model we define a set of measures that can characterise a location-based social aware- ness stream as well as act as indicators of volatile events occurring at a POI. We applied the model and measures on a dataset consisting of a collection of tweets generated at the city of Sheffield and registered over three week-ends. The model and measures introduced in this paper are relevant for design of future location-based services, real-time emergency-response models, as well as traffic forecasting. Our empirical findings demonstrate that social awareness streams not only can act as an event-sensor but also can enrich the profile of a location-entity.

Citation preview

Page 1: Volatile Classification of Point of Interests based on Social Activity Streams

A.E. Cano, A. Varga and F. Ciravegna The Oak Group, �

Department of Computer Science, �The University of Sheffield

Volatile Classification Volatile Classification of Point of Interests based on Social Awareness Streams

Page 2: Volatile Classification of Point of Interests based on Social Activity Streams

•  Introduction •  Motivation

•  Contributions

•  Related Work •  Geolattice Awareness Streams

•  Transient Semantic Classification of a POI

•  Experiments •  Conclusions and Future Work

Outline Outline

Page 3: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Introduction Social Awareness Streams

[1] M. Naaman, J. Boase, and C. H. Lai. Is it really about me?: message content in social awareness streams. In CSCW ’10: Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 189–192, New York, NY, USA, 2010. ACM.

Collection of semi-public, natural language messages produced by different users and characterised by their brevity

Page 4: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Introduction Social Awareness Streams

Used as a historical data set for profiling users’ activities and behavior [2][3][4].

Time

Murakami

Michio Kaku

Spotify

Iphone

Δt

Page 5: Volatile Classification of Point of Interests based on Social Activity Streams

Contextual Meaning of Places

Introduction Introduction

Page 6: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Motivation Point of Interest (POI)

o  Specific point location that someone may find useful or interesting.

o  Characterised by static data (e.g. name, geo-coordinates) o  Classified by the type of services it provides.

Coordinates: 51.507221, -0.1275

London, United Kingdom

Categories:   Leisure and entertainment   Literature, film and television   Museums and art galleries   Music   Sports

Source : http://en.wikipedia.org/wiki/London

Page 7: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Motivation Place

An evolving set of both communal and personal labels for potentially overlapping geometric volumes.

[5] Hightower, J.: From Position to Place. In: Proc. of the 2003 Workshop on Location-Aware Computing. (2003) 10–12 part of the 2003 Ubiquitous Computing Conf.

A meaningful place can capture the venues’ demographical, environmental, historical, personal or commercial significance.  

Page 8: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Motivation Social Awareness Streams and POIs

Can social awareness streams convey information for inducing a transient classification of a point of interest?

Page 9: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Motivation Point of Interest (POI)

Tourism

History

Fashion Shopping

Literature

Shopping

History

Literature

Tourism

Fashion

Page 10: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Motivation Point of Interest (POI)

Tourism

History

Fashion Shopping

Literature

Shopping

History

Literature

Tourism

Fashion

Tags: looting, unrest, police..

Page 11: Volatile Classification of Point of Interests based on Social Activity Streams

Introduction Motivation Point of Interest (POI)

Tourism

History

Fashion Shopping

Literature

Shopping

History

Literature

Tourism

Fashion

Tags: looting, unrest, riots, police..

Social Crisis

Violence

Politics Uprising

Page 12: Volatile Classification of Point of Interests based on Social Activity Streams

Related Work Related Work Use of SAS for Tagging Places

•  Cheng et al [6]: Spatial distribution of words in tweets for predicting users’ location. Their probabilistic framework can identify words in tweets with strong geo-scope.

•  Cheng et al [7]: Mobility patterns of users in LSS. They correlate social status, geographic and economic factors with mobility and perform sentiment analysis of posts for deriving unobserved context between people and location.

•  Lin et al [8]: Taxonomy of ‘place naming methods’, showing that a person’s perceived familiarity with a place and the variety of people who visit it, strongly influence the way people refer to this place when interacting with others.

Page 13: Volatile Classification of Point of Interests based on Social Activity Streams

Related Work Related Work Use of SAS for Tagging Places

•  Ireson and Ciravegna [9]: Toponym resolution (i.e. allocation of specific geo-location to target location terms) using Flickr data. Using geo-tag media, users’ contacts, content as features, they build an SVM classifier for learning location labels associated to a location term.

Page 14: Volatile Classification of Point of Interests based on Social Activity Streams

Rather than identifying words for tagging places we study how location-based generated content can be modelled for discovering topics or categories that classify a place in time. For this we present:

 GeoLattice Awareness Streams Model  Transient Semantic Classification of a POI

Related Work Related Work Contrasting our Work

Page 15: Volatile Classification of Point of Interests based on Social Activity Streams

A Geolattice Awareness Stream is a sequence of tuples

S:= (Poiq1,Mq2,Rq3,Y,ft)

GeoLattice Aware GeoLattice Awareness Streams

Definition 1.

Page 16: Volatile Classification of Point of Interests based on Social Activity Streams

A Geolattice Awareness Stream is a sequence of tuples

S:= (Poiq1,Mq2,Rq3,Y,ft)

GeoLattice Aware GeoLattice Awareness Streams

Definition 1.

Name Geographical-bounding area Geo-coordinates

Page 17: Volatile Classification of Point of Interests based on Social Activity Streams

A Geolattice Awareness Stream is a sequence of tuples

S:= (Poiq1,Mq2,Rq3,Y,ft)

GeoLattice Aware GeoLattice Awareness Streams

Definition 1.

source

Page 18: Volatile Classification of Point of Interests based on Social Activity Streams

A Geolattice Awareness Stream is a sequence of tuples

S:= (Poiq1,Mq2,Rq3,Y,ft)

GeoLattice Aware GeoLattice Awareness Streams

Definition 1.

Keywords Categories Hashtags

Page 19: Volatile Classification of Point of Interests based on Social Activity Streams

Given a GeoLattice Awareness Stream, a POI awareness stream can be defined as the sequence of tuples from S where:

GeoLattice Aware GeoLattice Awareness Streams

POI Awareness Stream

Page 20: Volatile Classification of Point of Interests based on Social Activity Streams

GeoLattice Aware GeoLattice Awareness Streams

Steel

Education

Football

Industrial Revolution

Gothic Revival

Modern Architecture

Supertram Urban Zone

Archeology

Bath

Politics

Thames

Westminster

POI Awareness Stream

UK  

Page 21: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

How can we derive a temporal classification (Rcat) of a POI given it’s POI Awareness Stream (S(Poi’)) ?

Problem

Page 22: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach

Retrieve Messages from a POI Awareness Stream for a window of time [ts-te]

Page 23: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Message Enrichment

Retrieve Messages from a POI Awareness Stream for a window of time [ts-te]

Enrich Messages

Page 24: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Message Enrichment

Retrieve Messages from a POI Awareness Stream for a window of time [ts-te]

Enrich Messages

Page 25: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Semantic Categorisation

Retrieve Messages from a POI Awareness Stream for a window of time [ts-te]

Enrich Messages Semantic

Categorisation

Page 26: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Semantic Categorisation

Retrieve Messages from a POI Awareness Stream for a window of time [ts-te]

Enrich Messages Semantic

Categorisation

Page 27: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Induce Category Function

Retrieve Messages from a POI Awareness Stream for a window of time [ts-te]

Enrich Messages Semantic

Categorisation

Induce Category Function

Page 28: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Induce Category Function

Category Entropy of a stream  

Indicates the topical diversity of the stream.  

Low category entropy levels reveal that a stream is dominated by few categories, while a high category balance reveals a higher topical diversity.  

Page 29: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Induce Category Function

Category Entropy of a stream

Normal Conditions Normal Conditions Broken (Events Happening)

Restaurant

Food

Italian Cuisine CE low

City

CE high

Restaurant

CE Increases

Food

Italian Cuisine

City

CE Decreases

Page 30: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Induce Category Function

Category Entropy of a stream

Normal Conditions Normal Conditions Broken (Events Happening)

Restaurant

Food

Italian Cuisine CE low

City

CE high

Restaurant

CE Increases

Food

Italian Cuisine

City

CE Decreases

Page 31: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Induce Category Function

Mutual Information  

Measures the information that two discrete random variables share. In this work we consider the MI between:

Categories- Hashtags(MI)  

Hashtags- Keywords(MI)  

Categories- Keywords(MI)  

Page 32: Volatile Classification of Point of Interests based on Social Activity Streams

Transient Semant Transient Semantic Classification of a POI

Approach- Induce Category Function

Mutual Information  

Measures the information that two discrete random variables share. In this work we consider the MI between:

Categories- Hashtags(MI)  

Hashtags- Keywords(MI)  

Categories- Keywords(MI)  

Page 33: Volatile Classification of Point of Interests based on Social Activity Streams

Enpirical Study Empirical Study

Data Set

We monitored tweets generated from the city of Sheffield, during 3 weekends.

Common – 2011-06-10 to 2011-06-13 Food Festival – 2011-07-08 to 2011-07-11 Tramlines Music Festival – 2011-07-22 to 2011-07-25

Page 34: Volatile Classification of Point of Interests based on Social Activity Streams

Enpirical Study Empirical Study

Data Set

After enriching the content of the tweets and querying DBpedia we got the following hashtags, resources, and categories:

Page 35: Volatile Classification of Point of Interests based on Social Activity Streams

Enpirical Study Empirical Study

Data Set

The ten top more frequent hashtags in the data sets were :

Page 36: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Category Entropy

Category entropy for each category of each of the three data set.

Page 37: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Category Entropy

Category entropy for each category of each of the three data set.

Page 38: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Categories with lowest Category Entropy

Page 39: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Categories with lowest Category Entropy

Page 40: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Categories with lowest Category Entropy

Event involving 2012 Olympic Tickets sales

Event involving Spanish Football

Page 41: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Categories with lowest Category Entropy

Page 42: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Deriving Context with MI

To provide a context in which the a category is used, we used the mutual information between categories and hashtags, from which we obtained a set of hashtags used to further derive related keywords (using Hashtag-Keywords MI).

Page 43: Volatile Classification of Point of Interests based on Social Activity Streams

Results Results

Highlights

A point of interest considered as a Location-Entity presents the “meformer” and “informer” patterns observed by Naaman et al.[9] in Person-Entity activity streams.

 Location-Entity Meformer pattern refers to a self focus of a Location-Entity, presenting information about endemic events.

 Location-Entity Informer pattern refers to an information sharing about external events, not necessarily related to this Location-Entity.

Page 44: Volatile Classification of Point of Interests based on Social Activity Streams

•  A formalisation for describing geographically bounded social awareness streams.

•  An approach to derive transient categorisations of Points of Interest based on DBPedia Categories.

Conclusions Conclusions and Future Work

We have presented :

Future Work:

•  Determining a semantic coherence metric which could induce broader category clusters.

Page 45: Volatile Classification of Point of Interests based on Social Activity Streams

References References

[1] M. Naaman, J. Boase, and C. H. Lai. Is it really about me?: message content in social awareness streams. In CSCW ’10: Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 189–192, New York, NY, USA, 2010. ACM. [2] C. Wagner and M. Strohmaier. The wisdom in tweetonomies: Acquiring latent conceptual structures from social awareness streams. In Proc. of the Semantic Search 2010 Workshop (SemSearch2010), april 2010. [3] F.Abel,Q.Gao,G.Houben,andK.Tao.Semanticenrichmentoftwitterpostsforuserprofile construction on the social web. In In proceedings of Extended Semantic Web Conference 2011, May 2011. [4] A. Cano, S. Tucker, and F. Ciravegna. Capturing entity-based semantics emerging from personal awareness streams. In Proceedings of the Workshop on Making Sense of Microposts (MSM2011), May 2011.

Page 46: Volatile Classification of Point of Interests based on Social Activity Streams

References References

[6] Hightower, J.: From Position to Place. In: Proc. of the 2003 Workshop on Location-Aware Computing. (2003) 10–12 part of the 2003 Ubiquitous Computing Conf. [7] Z. Cheng, J. Caverlee, and K. Lee. You are where you tweet: a content-based approach to geo-locating twitter users. In Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM ’10, pages 759–768, New York, NY, USA, 2010. ACM. [8] K. L. D. Z. S. Zhiyuan Cheng, James Caverlee. Exploring millions of footprints in locationsharing services. In 5th International Conference on Weblogs and Social Media (ICWSM), ICWSM ’11. ACM, 2011. [9] J. Lin, G. Xiang, J. I. Hong, and N. Sadeh. Modeling people’s place naming preferences inlocation sharing. In Proceedings of the 12th ACM international conference on Ubiquitous computing, Ubicomp ’10, pages 75–84, New York, NY, USA, 2010. ACM. [10] N. Ireson and F. Ciravegna. Toponym resolution in social media. In Proc., 9th InternationalSemantic Web Conference. ISWC 2010, 2010.