
Understanding and Combating Link Farming in the Twitter Social Network


Karimi Kawusi


Link farming and link spam on Twitter

Mechanisms for combating link farming

Analysis of 40,000 spammers on Twitter

Who are the link spammers?

Examining the root of the problem and combating link farming

Ideas and methods proposed for combating link spam


A link farm is a collection of artificial, highly interlinked websites created for the sole purpose of tricking a search engine into the incorrect conclusion that particular websites are more popular than they really are.

Figure: a link farm. Each circle represents a website, arrows represent links between websites, and the promoted site has numerous inbound links.

Search engine firms are generally wise to this strategy and have developed countermeasures.


Link farming on Twitter

A link farm is any group of websites that all hyperlink to every other site in the group, created by hand or by automated programs and services.

Link spam on Twitter

Link spam is defined as links between pages that take advantage of link-based ranking algorithms, which give a website a higher ranking the more other highly ranked websites link to it.
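To make the effect of a link farm on a link-based ranking concrete, here is a minimal sketch using a simplified Pagerank-style power iteration. The page names, the two toy graphs, and the damping value of 0.85 are all assumptions made up for illustration; this is not the ranking function of any particular search engine.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified Pagerank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the same small "teleport" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical web: three ordinary pages plus a page nobody links to.
baseline = {"a": ["b"], "b": ["c"], "c": ["a"], "target": ["a"]}

# Same web plus a five-page farm: each farm page links to every other
# farm page and to the page being promoted.
farm = ["f1", "f2", "f3", "f4", "f5"]
farmed = dict(baseline)
for page in farm:
    farmed[page] = [other for other in farm if other != page] + ["target"]

baseline_score = pagerank(baseline)["target"]
farmed_score = pagerank(farmed)["target"]
```

In this toy setup `farmed_score` comes out higher than `baseline_score`, even though the farmed graph has more pages competing for the same total rank.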


Spammers' goals

Some 200 million Twitter users post 150 million messages every day, which attracts the attention of spammers.

Spammers want to place their pages at the top of search results.

Search ranking and spammers

Search engines do not rank on content alone.

A user's influence and connectivity in the social graph affect the search engine's ranking algorithm.

More followers lead to a higher ranking.

Spammers therefore increase their social influence by acquiring followers (links).


Link farming on the Web and in the Twitter social network

Analysis of about 40,000 spammers

Proposal of a random ranking system


Related work

Iterative ranking algorithms (such as HITS)

Trust-ranking algorithms

Inverse trust-ranking algorithms

Algorithms for choosing the initial set of pages

and others


Spam on Twitter

Machine-learning algorithms

Tools for distributing spam automatically

Twitter users themselves give rise to link farming

How link farms are created

Requires a sample of spammers

How users connect to spammers

Analysis of a Twitter dataset (54 million accounts, 2 billion links connecting users, and 1.7 billion posts made by users)


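Method for identifying spammers

Relies on Twitter's official account-suspension policy

A crawler that requests a suspended account is redirected to the page http://twitter.com/suspended

This procedure is repeated

Uses two public URL-shortening services, tinyurl and bitly

Account compromise (password guessing)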
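The two checks mentioned on this slide can be sketched as small helper functions. This is only an illustration under stated assumptions: the function names are hypothetical, the hostnames `tinyurl.com` and `bit.ly` are assumed for the two shortening services, and how the two signals are combined into a final spammer label is not specified here. No network request is made; the caller is assumed to supply the URL the crawler ended up on.

```python
from urllib.parse import urlparse

# Hostnames assumed for the two URL-shortening services named on the slide.
SHORTENER_HOSTS = {"tinyurl.com", "bit.ly"}


def _host(url):
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def is_suspended(final_url):
    """True if a profile request ended up on Twitter's suspension page."""
    path = urlparse(final_url).path.rstrip("/")
    return _host(final_url) == "twitter.com" and path == "/suspended"


def uses_shortener(url):
    """True if the URL points at one of the two shortening services."""
    return _host(url) in SHORTENER_HOSTS
```

For example, `is_suspended("http://twitter.com/suspended")` is true, while a URL for an ordinary profile page is not treated as suspended.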


Figure 1: Terminology for the spammer’s social neighborhood


Figure 2: Number of spam-targets, spam-followers and their overlap. 82% of spam-followers overlap with the spam-targets.
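The overlap reported in Figure 2 can be computed with plain set operations. The sketch below assumes that a spam-target is an account a spammer follows and a spam-follower is an account that follows a spammer; the slides do not spell these definitions out, and the function name and edge format are hypothetical.

```python
def spam_neighborhood(follow_edges, spammers):
    """Compute spam-targets, spam-followers and their overlap.

    follow_edges is an iterable of (follower, followed) pairs.
    Returns (targets, followers, overlap, share), where share is the
    fraction of spam-followers that are also spam-targets.
    """
    spammers = set(spammers)
    targets, followers = set(), set()
    for follower, followed in follow_edges:
        # Assumed definition: a spam-target is followed by a spammer.
        if follower in spammers and followed not in spammers:
            targets.add(followed)
        # Assumed definition: a spam-follower follows a spammer.
        if followed in spammers and follower not in spammers:
            followers.add(follower)
    overlap = targets & followers
    share = len(overlap) / len(followers) if followers else 0.0
    return targets, followers, overlap, share


# Toy example: spammer "s" follows "a" and "b"; "a" and "c" follow "s".
edges = [("s", "a"), ("s", "b"), ("a", "s"), ("c", "s")]
targets, followers, overlap, share = spam_neighborhood(edges, {"s"})
```

In the toy example only "a" is both targeted by the spammer and following it, so half of the spam-followers overlap with the spam-targets.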


Figure 3: Number of spammers who rank within the top K according to Pagerank

Table 1: Follower-count statistics


Figure 3: Number of spammers (among the 41,352 identified ones) who rank within the top K according to Pagerank


Index

4. ANALYSIS OF LINK FARMERS

4.1 Popular users more likely to farm links

4.2 Top link farmers are not spammers

4.3 Top link farmers are active contributors

4.4 Top link farmers are social capitalists

4.5 Summary

5. COMBATING LINK FARMING

5.1 Collusionrank

5.1.1 Collusionrank + Pagerank

5.2 Evaluating Collusionrank

Effect on rankings of spammers

Effect on rankings of social capitalists following spammers

Effect on normal users who are neither spammers nor spam-followers

6. CONCLUSION

7. REFERENCES


4. Analysis Of Link Farmers

Our goal in this section is to get a better insight into what drives link farming in Twitter.

For this, we analyze the characteristics (network connectivity and tweeting activity) of the users who are willing to reciprocate links from arbitrary users, and their potential reasons for engaging in link farming.


4.1 Popular users more likely to farm links

Figure 5 shows how the probability of a user reciprocating a link from spammers varies with the user's indegree (number of followers).
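A curve like this can be produced by grouping targeted users by indegree and taking the fraction in each group that followed the spammer back. The sketch below is an assumption about how such a curve might be computed, not the method behind Figure 5: the function name, the input format, and the choice of order-of-magnitude bins are all hypothetical.

```python
from collections import defaultdict


def reciprocation_by_indegree(targeted_users):
    """Estimate reciprocation probability per indegree bin.

    targeted_users is an iterable of (indegree, reciprocated) pairs, one
    per user followed by a spammer. Bin k holds users with between 10**k
    and 10**(k + 1) - 1 followers; bin -1 holds users with no followers.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [reciprocated, total]
    for indegree, reciprocated in targeted_users:
        # Number of decimal digits minus one gives the order of magnitude.
        bin_id = len(str(indegree)) - 1 if indegree > 0 else -1
        counts[bin_id][0] += int(reciprocated)
        counts[bin_id][1] += 1
    return {
        bin_id: hits / total
        for bin_id, (hits, total) in sorted(counts.items())
    }


# Made-up sample: (number of followers, followed the spammer back?)
sample = [(5, False), (8, True), (50, True), (5000, True), (7000, True), (0, False)]
curve = reciprocation_by_indegree(sample)
```

With this made-up sample, half of the users with fewer than ten followers reciprocate, while all users in the higher bins do.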


4.2 Top link farmers are not spammers

Figure 6: Node degree distributions of top 100K link farmers, spammers and a random sample of Twitter users. Top link farmers have very high indegree and outdegree compared to both spammers and a random population. Also, most of the top link farmers have indegree/outdegree ratios near 1.


Figure 6(a): Cumulative distributions of indegree (number of followers).


Figure 6(b): Cumulative distributions of outdegree (number of followings).


Figure 6(c): Ratio of indegree to outdegree (followers per following) for the top link farmers and the 41,352 spammers.


Top link farmers have indegrees and outdegrees one to two orders of magnitude higher than those of spammers. Their indegree-to-outdegree ratios are also considerably higher than those of spammers (and close to 1).

The fact that top link farmers exhibit very different network connectivity from spammers further suggests that a majority of top link farmers are not spammers.
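The three quantities compared here (indegree, outdegree, and their ratio) can be derived from a list of follow relationships. The following is a minimal sketch; the function name, the edge format, and the toy accounts are assumptions for illustration only.

```python
from collections import Counter


def degree_stats(follow_edges):
    """Return {user: (indegree, outdegree, indegree / outdegree)}.

    follow_edges is an iterable of (follower, followed) pairs. Users who
    follow nobody get a ratio of infinity.
    """
    indegree, outdegree = Counter(), Counter()
    for follower, followed in follow_edges:
        outdegree[follower] += 1
        indegree[followed] += 1
    stats = {}
    for user in set(indegree) | set(outdegree):
        followers, followings = indegree[user], outdegree[user]
        ratio = followers / followings if followings else float("inf")
        stats[user] = (followers, followings, ratio)
    return stats


# Toy graph: "s" follows everyone but nobody follows "s" back.
edges = [
    ("a", "b"), ("b", "a"), ("c", "a"), ("a", "c"),
    ("s", "a"), ("s", "b"), ("s", "c"),
]
stats = degree_stats(edges)
```

In this toy graph the account "s" has a ratio of 0, while the mutually connected accounts have ratios of 1.5 or more.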


4.3 Top link farmers are active contributors

We crawled the profile pages of the top 100,000 link farmers in July 2011.

Table 2: Characteristics from profile and activity of the top 100,000 link farmers

                   Has Bio   Has URL   Profile Pic   Changed profile theme   Has Location   Has Lists
Top link farmers   87%       79%       96%           84%                     84%            23%
Random sample      25%       14%       50%           40%                     36%            4%


Our analysis suggests that, compared to random Twitter users, the top link farmers are active users who make heavier use of their profile information and explore more of the features provided by Twitter.


Table 3: Names and extracts from Twitter account bios of 10 link farmers: the ones having the most links to spammers and the ones ranked highest according to Pagerank.

Top 5 link farmers according to Pagerank

Barack Obama: Obama 2012 campaign staff

Britney Spears: It's Britney

NPR Politics: Political coverage and conversation

UK Prime Minister: PM's office

JetBlue Airways: Follow us and let us help

Top 5 link farmers according to #links to spammers

Larry Wentz: Internet, Affiliate Marketing

Judy Rey Wasserman: Artist, founder

Chris Latko: Interested in tech. Will follow back

Paul Merriwether: helping others, let's talk soon

Aaron Lee: Social Media Manager


To gain more insight into the topical expertise of top link farmers, we generated word-clouds from their account bios.

Figure 7: Word-cloud of words in the Twitter account bios of (a) the top 100,000 link farmers and (b) a random sample of users.


Also, a manual analysis of 100 randomly selected top link farmers (as described in Section 4.2) showed that a majority of their tweets contain links to legitimate external web pages.

This is in contrast to the general Twitter population (the random sample), who describe themselves using words such as love, life, live, music, student, and friend, and most of whom never tweet links to external web pages.


4.4 Top link farmers are social capitalists

We now explore potential reasons why top link farmers participate in link farming.

Specifically, we ask the following question: what motivates legitimate, popular, and actively contributing Twitter users to indiscriminately follow back anyone who connects to them?

One simple and intuitive explanation is that these users have incentives similar to those of spammers.


Since the desire for social capital drives their link farming behavior, we call such users social capitalists.

Social capitalists connect to a vast majority (over 80%) of their network neighbors via reciprocated links.

They also interconnect heavily with each other to increase their mutual influence.

The Twitter sub-graph formed by the 100,000 social capitalists is densely connected, with approximately 181 million links, which implies a high network density of 0.018 (in comparison, the entire Twitter network has a density of 6.5 × 10^-7).
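The density figure can be sanity-checked with a short calculation. The sketch below assumes that density means the fraction of all possible directed links that actually exist; the slides do not state the formula, and both function names are hypothetical.

```python
def directed_density(num_nodes, num_links):
    """Fraction of the possible directed links (no self-links) that exist."""
    return num_links / (num_nodes * (num_nodes - 1))


def links_for_density(num_nodes, density):
    """Number of directed links that would produce the given density."""
    return density * num_nodes * (num_nodes - 1)


# Working backwards from the stated density over 100,000 accounts.
implied_links = links_for_density(100_000, 0.018)
```

Under this assumed definition, a density of 0.018 over 100,000 accounts corresponds to roughly 180 million directed links.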


Figure: "318 biggest capitalists in the world" (Glattfelder), with companies marked as super connected or very connected.


Finally, we analyzed the influence of the social capitalists in the network.

We computed the following three widely used metrics:

Follower-rank

Pagerank

Retweeted-rank


4.5 Summary

We analyzed the characteristics of the link farmers.

We find that the top link farmers are legitimate, popular, and highly active users in Twitter.

We conjectured that the motivating factor for such users might be the desire to acquire social capital and, thereby, influence. We showed evidence that these social capitalists connect with others with a similar desire to amass social capital, including each other and spammers.


5. Combating Link Farming

Input: network, G; set of known spammers, S; decay factor for biased Pagerank

Output: Collusionrank scores, c

initialize score vector d for all nodes n in G
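The pseudocode on this slide breaks off after the initialization step, so the following is only a sketch of how a Pagerank biased toward known spammers could assign penalty scores. The initial values, the update rule, the parameter names, and the default decay factor of 0.85 are all assumptions made for illustration; they are not taken from the truncated pseudocode above.

```python
def collusionrank(followings, spammers, alpha=0.85, max_iter=100, tol=1e-10):
    """Sketch of a spammer-biased Pagerank that yields penalty scores.

    followings maps each user to the set of users they follow. Scores are
    zero or negative; a more negative score means the user is closer, via
    follow links, to the known spammers.
    """
    nodes = set(followings)
    for followed in followings.values():
        nodes |= set(followed)
    spammers = set(spammers) & nodes
    if not spammers:
        return {node: 0.0 for node in nodes}

    # Number of followers of each user, used to split a user's score.
    follower_count = {node: 0 for node in nodes}
    for followed in followings.values():
        for user in followed:
            follower_count[user] += 1

    # Assumed initialization: known spammers share a total penalty of -1.
    d = {n: (-1.0 / len(spammers) if n in spammers else 0.0) for n in nodes}
    c = dict(d)
    for _ in range(max_iter):
        new_c = {}
        for node in nodes:
            # Assumed update: a user inherits part of the penalty of each
            # account they follow, split across that account's followers.
            inherited = sum(
                c[user] / follower_count[user]
                for user in followings.get(node, ())
            )
            new_c[node] = alpha * inherited + (1.0 - alpha) * d[node]
        change = max(abs(new_c[n] - c[n]) for n in nodes)
        c = new_c
        if change < tol:
            break
    return c


# Toy network: "a" follows spammer "s", "b" follows "a", "c" follows nobody.
toy = {"a": {"s"}, "b": {"a"}, "c": set(), "s": set()}
scores = collusionrank(toy, {"s"})
```

In the toy network the penalty fades with distance from the spammer: "a", who follows the spammer directly, scores lower than "b", and "c", who follows nobody, keeps a score of zero.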