33
Kamesh Munagala (Duke) Reza Bosagh Zadeh (Stanford) Ashish Goel (Stanford) Aneesh Sharma(Twitter, Inc.) On the Precision of Social and Information Networks

On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Kamesh Munagala (Duke)

Reza Bosagh Zadeh (Stanford) Ashish Goel (Stanford)

Aneesh Sharma(Twitter, Inc.)

On the Precision of Social and Information Networks

Page 2: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Information Networks � Social Networks play an important

role in information dissemination �  Emergency events, product launches, sports

updates, celebrity news,…

� Their effectiveness as information dissemination mechanisms is a source of their popularity

Page 3: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

A Fundamental Tension Two conflicting characteristics in social networks:

� Diversity: Users are interested in diverse content � Broadcast: Users disseminate information via posts/

tweets – these are blunt broadcast mechanisms!

Page 4: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Running Example

Adam interested in •  Apple •  Rap music •  Lakers

Bob tweets about: •  Christianity •  DC Politics •  Bulls

Charlie tweets about: •  Jay-Z •  Lady Gaga •  Kobe

Page 5: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Running Example

Adam interested in •  Apple •  Rap music •  Lakers

Bob tweets about: •  Christianity •  DC Politics •  Bulls

Charlie tweets about: •  Jay-Z •  Lady Gaga •  Kobe

Follow Follow

Page 6: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

A Fundamental Tension Two conflicting characteristics in social networks:

� Diversity: Users are interested in diverse content � Broadcast: Users disseminate information via posts/

tweets – these are blunt broadcast mechanisms!

Precision: Do users receive a lot of un-interesting content?

Recall: Do users miss a lot of potentially interesting content?

Page 7: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Question we study

Can information networks have high precision and recall?

Page 8: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Case Study: Twitter �  A random tweet is uninteresting to a random user…

�  … but users have interests and follow others based on these

Information networks like Twitter are constructed according to users’ interests!

Page 9: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Revisiting our example…

Adam interested in •  Apple •  Rap music •  Lakers

Bob tweets about: •  Christianity •  DC Politics •  Bulls

Charlie tweets about: •  Jay-Z •  Lady Gaga •  Kobe

Follow

Page 10: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Small User Study on Twitter

!"

!#$"

!#%"

!#&"

!#'"

!#("

!#)"

!#*"

!#+"

!#,"

$"

$" %" &" '" (" )" *" +" ," $!"

!"#$%&%'()

*&#")(+,-#")

*&#"."/0#1)2"#$%&%'()'3)04##0&)

-./01.20"

34256/"

Page 11: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Roadmap �  Assumptions:

1.  Users have immutable interests (independent of the network) 2.  Choose to connect to other users based on their interests 3.  Step (2) is optimized for precision and recall

Page 12: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Roadmap �  Assumptions:

1.  Users have immutable interests (independent of the network) 2.  Choose to connect to other users based on their interests 3.  Step (2) is optimized for precision and recall

�  Question 1: What conditions on the structure of user interests are necessary for high precision and recall, and small dissemination time?

�  Question 2: Can we empirically validate these conditions as well as the conclusion on Twitter?

Page 13: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

User-Interest Model �  Set of interests I; Set of users U

�  Each interest i is associated with two sets of users: � Producers P(i) = Users who tweet about i � Consumers C(i) = Users who are interested in i

� Denote the mapping from users to interests as Q(I,U)

�  Assume: P(i) µ C(i) for all interests i

Page 14: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Example Revisited

P(a) = {q} C(a) = {q, r, s}

P(b) = {s} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, r, s, t}

User a

User b User c

Page 15: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Social (user-user) Graph G(U,E)

Social graph

P(a) = {q} C(a) = {q, r, s}

P(b) = {s} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, r, s, t}

User a

User b User c

User a receives interests R(a) = {q, t}

Page 16: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

PR Score PR(u) = Precision and recall score for user u

- Function of user-interest map Q(I,U) and social graph G(U,E) PR(u) =

R(u)!C(u)R(u)"C(u)

Interests u receives from its followees

The consumption interests of u

Page 17: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Example

Social graph

P(a) = {q} C(a) = {q, r, s}

P(b) = {s, t} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, s, t}

User a

User b User c

R(a) = {q, t} C(a) = {q, r, s} PR(a) = ¼ = 0.25

Page 18: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Improved Score

Social graph

P(a) = {q} C(a) = {q, r, s}

P(b) = {s, t} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, s, t}

User a

User b User c

R(a) = {q, s, t} C(a) = {q, r, s} PR(a) = 2/4 = 0.5

Page 19: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

α-PR User-Interest Maps Q(I,U)

A user-interest map Q(I,U) is α-PR if: There exists a social graph G(U,E) s.t.

all users u have PR-Score ≥α

Special case: 1-PR means that R(u) = C(u) for all users u

Page 20: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Necessary Conditions for 1-PR �  Condition 1:

If Q(I,U) is “non-trivial” and G(U,E) is (strongly) connected: Then P(i) ½ C(i) for some interest i

�  Informal implication: Users have broader consumption interests and narrower

production interests

Page 21: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Experimental Setup � Classify text of tweets using 48 topics

� Yields “topic distribution” for each user �  Entropy of distribution lies between 0 and log2(48) = 3.87

�  P(u)= Interest distribution in tweets produced by u �  C(u) = Interest distribution in URL clicks made by u

Page 22: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Verifying Condition 1

TYPE OF INTEREST DISTRIBUTION

AVERAGE SUPPORT

AVERAGE ENTROPY

Consumption 7.78 2.00

Production 3.96 1.24

Page 23: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Can Interests be chosen at Random? Different interests can have different “participation levels” Theorem: If users choose P production and C consumption interests at random preserving participation levels of the interests then:

With high probability the interest structure is not α-PR for any constant α

Technically needs:

�  n = |U| and |I| = m > n1/2 �  P = logδn forδ > 2 and C < n1/3 �  Bounded second moment of participation level distribution

Key proof idea: Q(I,U) behaves like an expander graph

Page 24: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

● ●

●●●

●●

● ●

●●

● ●

●●

1

2

3

4

5

6

7

8

9

10 11

12

13

14

15

161718

19

20

21 22

23

24

25

26

27

28

29

30

31 32

3334

35

36

37

38

39

40

41 42

4344

45

46

47

Condition 2: Interests have Clustered Structure

Share more users than is predicted by a random assortment

Sports/Games

Technology

Page 25: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Interest Structure achieving 1-PR Kronecker graph model

User u

Kobe Gaga Obama

Attributes/Dimensions

Y N M N

d = O(log n) dimensions K = O(log n) values Lakers

Page 26: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Interest Structure achieving 1-PR Kronecker graph model

User u

Kobe Gaga Obama

Attributes/Dimensions

Y N M N

d = O(log n) dimensions K = O(log n) values

Similarity graph on values Y M N

Y 1 1 0

M 1 1 1

N 0 1 1 Y M N

Lakers

Page 27: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Interest Structure

Interest i

Kobe Gaga Obama

Attributes/Dimensions

Y M

Lakers

Similarity graph on values

Y M N

Y * M *

M * N *

Producer

Consumer

N * M * Not interested

Agrees exactly on all relevant dimensions

Set of relevant dimensions + their values

Similar on all relevant dimensions

Page 28: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

User-user Graph [Leskovec, Chakrabarti, Kleinberg, Faloutsos, Ghahramani ‘10]

User a

Kobe Gaga Obama

Attributes/Dimensions

Y M

Lakers Similarity graph on values

Y M N

Y M N Y

M N M M

User b

User c

Y M

Edge

Edge

Undirected Edge between two users iff ALL dimensions are similar

Such graphs can have: •  Super-constant average degree •  Heavy tailed degree distributions •  Constant diameter

Page 29: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Main Positive Result �  The Kronecker interest structure has 100% PR!

�  Users only receive interesting information

�  Users receive all information they are interested in

�  The dissemination time is constant.

Page 30: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Empirical Study of Precision

Precision(u) =C(u)!P(v)

(u,v)"E#

P(v)(u,v)"E#

0.0 0.2 0.4 0.6 0.8 1.0Precision per user

0

100

200

300

400

500

600

Freq

uenc

y

Precision distribution

Median precision = 40% Baseline precision = 17%

Interpretation: One in 2.5 interests received on any follow edge are interesting

Page 31: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Caveat: This is only a first step!

� Measuring interests � Use URL clicks as a measure of consumption/relevance � Use 48 topics as proxy for interests � Not considered quality of tweets in measuring interest � Not explored structure of interests in great detail

� Empirical validation � User studies are more reliable, but our study is small � We have not measured recall or dissemination time

Page 32: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Open Questions � Better empirical measures of interests and PR?

�  In-depth analysis of structure of interests � How can recall be measured?

� Can high PR information networks arise in a decentralized fashion? � How can users discover high PR links?

Page 33: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social