Detecting Trends

Stanislav Nikolov (MIT, Twitter) Devavrat Shah (MIT) Interdisciplinary Workshop on Information and Decision in Social Networks 2012

Page 1: Detecting Trends

Detecting Trends
Stanislav Nikolov §,† Devavrat Shah §

§ MIT   † Twitter

Page 2: Detecting Trends

Source: http://twoinformcanada.ca/wp-content/uploads/2012/07/barclays.jpg

Page 4: Detecting Trends

The Barclays Libor scandal

12:49: “#Barclays” is listed as a trending topic on Twitter

Page 6: Detecting Trends

•  Is there enough information before the “jump”?

•  Can we predict which topics will trend in advance?

Page 7: Detecting Trends

Yes.

Page 8: Detecting Trends

(best parameter setting)

•  79% early detection
•  1.43 hours mean early detection
•  95% true positive rate (TPR), 4% false positive rate (FPR)

Page 12: Detecting Trends

What are Trending Topics?
•  Twitter: a global communication network.
•  Tweet: a short, public message.
•  Topic: a phrase in a tweet.
•  Trending topic (a “trend”): a topic that becomes popular.

Page 18: Detecting Trends

A Parametric Model
•  Expect a certain type of pattern (e.g. constant + jumps).
•  Fit parameters to data (e.g. how much of a jump).
•  Decide if the jump is big enough.

[Figure: activity vs. time. Across the slide build the fitted parameter reads p = 0.1, then p = 0.6, then p = 4.1, at which point the plot is marked "trend detected!"]
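
The three steps above (expect a pattern, fit its parameters, decide whether the jump is big enough) can be illustrated with a minimal sketch. It assumes a single-step "constant + jump" model fitted by least squares; the function names, the `min_jump` threshold, and the sample data are hypothetical and are not taken from the slides.

```python
import numpy as np

def fit_constant_plus_jump(activity):
    """Least-squares fit of a hypothetical one-step model: constant level, then a jump.

    Returns (jump_index, baseline, jump_size) for the split with the lowest squared error.
    """
    y = np.asarray(activity, dtype=float)
    best = None
    for k in range(1, len(y)):  # try every candidate jump position
        before, after = y[:k], y[k:]
        sse = ((before - before.mean()) ** 2).sum() + ((after - after.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, k, before.mean(), after.mean() - before.mean())
    _, k, baseline, jump = best
    return k, baseline, jump

def is_trend(activity, min_jump):
    """Decide whether the fitted jump is big enough (min_jump is an assumed threshold)."""
    _, _, jump = fit_constant_plus_jump(activity)
    return jump >= min_jump

# Illustrative data only, not from the talk.
quiet = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0]
bursty = [1.0, 1.1, 0.9, 4.0, 4.2, 4.1]
print(is_trend(quiet, min_jump=2.0), is_trend(bursty, min_jump=2.0))
```

A model like this only recognizes the one shape it was built for, which is the limitation the next slides point at.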

Page 19: Detecting Trends

Parametric Models are Inadequate!

[Figure: activity vs. time, marked "trend detected!"]

Page 29: Detecting Trends

A Data-Driven Approach
•  All of the information is in the data.
•  Hypothesis
    – Tweets are written by people.
    – People are simple.
        •  In how they spread information.
        •  In how they connect to one another.
    – Small number of distinct “ways” in which a topic can become trending.

Page 36: Detecting Trends

Classification by Experts

[Figure sequence: surviving labels are "s", "r", "vote" and "observation"]

Page 46: Detecting Trends

Properties
•  Simple (just compute distances)
•  Scalable (can compute distances in parallel)
•  Non-parametric – model “parameters” scale with the data
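
A rough sketch of what "just compute distances" and "vote" could look like in code follows. The slides do not give the distance function, the vote weighting, or the decision rule, so everything specific here is an assumption: squared Euclidean distance, vote weight exp(-gamma * distance), and a threshold `theta` on the ratio of votes. All names and data are hypothetical.

```python
import numpy as np

def vote_score(s, trend_refs, non_trend_refs, gamma=1.0):
    """Ratio of weighted votes from trend vs. non-trend reference signals.

    Assumptions (not from the slides): squared Euclidean distance and
    exponentially decaying vote weights exp(-gamma * distance).
    """
    s = np.asarray(s, dtype=float)

    def votes(refs):
        # Each reference signal r votes on its own, so this loop could run in parallel.
        return sum(np.exp(-gamma * np.sum((s - np.asarray(r, dtype=float)) ** 2)) for r in refs)

    return votes(trend_refs) / votes(non_trend_refs)

def classify(s, trend_refs, non_trend_refs, gamma=1.0, theta=1.0):
    """Label the observation s as a trend when trend votes outweigh non-trend votes by theta."""
    return vote_score(s, trend_refs, non_trend_refs, gamma) > theta

# Illustrative reference signals only, not data from the talk.
trend_refs = [[1, 2, 4, 8], [1, 3, 5, 9]]
non_trend_refs = [[1, 1, 1, 1], [2, 2, 2, 2]]
print(classify([1, 2, 5, 8], trend_refs, non_trend_refs))
print(classify([1, 1, 2, 1], trend_refs, non_trend_refs))
```

Nothing is fitted in this sketch: the reference signals themselves play the role of the model, which is one way to read "model parameters scale with the data".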

Page 47: Detecting Trends

Experimental Results

Page 48: Detecting Trends

Experiment
•  500 trends.
•  500 non-trends.
•  Do trend detection on a 50% hold out set.
•  Online signal classification.

Page 49: Detecting Trends

Results – Early Detection

(best parameter setting)

Page 50: Detecting Trends

Results – FPR / TPR Tradeoff
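
For readers unfamiliar with the tradeoff named on this slide, the sketch below shows how a true positive rate and false positive rate are computed and how they move together as a detection threshold changes. The scores, labels, and the threshold name `theta` are made up for illustration and are not the experiment's data.

```python
def tpr_fpr(scores, labels, theta):
    """Return (TPR, FPR) when every score above theta is declared a trend."""
    tp = sum(1 for sc, y in zip(scores, labels) if y and sc > theta)
    fp = sum(1 for sc, y in zip(scores, labels) if not y and sc > theta)
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# Hypothetical detector scores; True marks an actual trend.
scores = [5.0, 3.0, 0.8, 2.0, 0.5, 0.2]
labels = [True, True, True, False, False, False]

# A low threshold catches every trend but also flags every non-trend;
# a high threshold removes false alarms at the cost of missed trends.
for theta in (0.1, 1.0, 4.0):
    print(theta, tpr_fpr(scores, labels, theta))
```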

Page 51: Detecting Trends

Results – Early / Late Tradeoff

Page 56: Detecting Trends

Concluding Remarks
•  Algorithm to detect trends early
•  Scalable nonparametric time series analysis
    – prediction
    – classification
    – anomaly detection