Understanding And Combating
Link Farming In The Twitter Social Network
Karimi Kawusi
Outline
Link farming and link spam
Link farming in Twitter
Analysis of 40,000 spammers in Twitter
Who are the link spammers?
Examining the root of the problem and mechanisms for combating link farming
Ideas and methods proposed for combating link spam
A link farm is a collection of artificial, highly interlinked websites, created for the sole purpose of trying to hoodwink a search engine into concluding that particular websites are more popular than they really are.
Figure: each circle represents a website, arrows represent links between websites, and a farmed website receives numerous inbound links.
Generally, search engine firms are wise to this strategy and have developed countermeasures.
Link farming in Twitter
A link farm is any group of web sites that all hyperlink to every other site in the group, created by hand or by automated programs and services.
Link spam in Twitter
Link spam consists of links between pages that take advantage of link-based ranking algorithms, which give a website a higher ranking the more other highly ranked websites link to it.
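To illustrate why link-based ranking is open to this kind of abuse, here is a minimal sketch of a Pagerank-style power iteration run on a toy web with and without a link farm. The graph, the page names, the damping value of 0.85, and the function name `pagerank` are all assumptions made for the example; they are not taken from the slides.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration Pagerank over a dict {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its damped rank over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Toy web (hypothetical): "target" has no inbound links at first.
honest_web = {"a": ["b"], "b": ["c"], "c": ["a"], "target": ["a"]}

# A five-page farm: every farm page links to the target and to the other farm pages.
farm = {
    f"farm{i}": ["target"] + [f"farm{j}" for j in range(5) if j != i]
    for i in range(5)
}
farmed_web = {**honest_web, **farm}

before = pagerank(honest_web)
after = pagerank(farmed_web)
print(before["target"], after["target"])
```

In this toy setting the target's score rises once the farm points at it, and it outranks every farm page, even though no honest page links to it.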
Spammers' goals
200 million Twitter users post 150 million messages every day, which attracts the attention of spammers
Placing pages at the top of search results
Search ranking and spammers
Search engines do not rank on content alone
Search engine algorithms also take into account users' influence and connectivity in the social graph
More followers lead to a higher ranking for a page
Social influence is increased by acquiring (links)
Link farming in the Web and in the Twitter social network
Analysis of about 40,000 spammers
Proposal of a random ranking system
Related work
Iterative ranking algorithms (such as HITS)
Trust ranking algorithms
Inverse trust ranking algorithms
Algorithms for selecting the initial (seed) set of pages
and so on
Spam in Twitter
Machine learning algorithms
Tools for spreading spam automatically
Twitter users contribute to the creation of link farming
How link farming is created
A sample of spammers is needed
How users connect to spammers
Analysis of a Twitter dataset (the dataset includes 54 million accounts and the connections among users, comprising 2 billion links, as well as 1.7 billion posts)
Method for identifying spammers
Relying on Twitter's official account-suspension policy
The crawler is redirected to the page http://twitter.com/suspended
Repeating this algorithm
Using two public URL-shortening services, tinyurl and bitly
Compromise (password guessing)
Figure 1: Terminology for the spammer’s social neighborhood
Figure 2: Number of spam-targets, spam-followers and their overlap. 82% of spam-followers overlap with the spam-targets.
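The overlap reported in Figure 2 can be computed with plain set operations. The sketch below assumes that a spam-target is a non-spammer followed by at least one spammer and a spam-follower is a non-spammer who follows at least one spammer; these definitions, the function name `spam_neighborhood`, and the toy edge list are assumptions for illustration, not data from the study.

```python
def spam_neighborhood(follow_edges, spammers):
    """follow_edges: iterable of (follower, followed) pairs."""
    spammers = set(spammers)
    spam_targets = set()    # users followed by at least one spammer
    spam_followers = set()  # users who follow at least one spammer
    for follower, followed in follow_edges:
        if follower in spammers and followed not in spammers:
            spam_targets.add(followed)
        if followed in spammers and follower not in spammers:
            spam_followers.add(follower)
    overlap = spam_targets & spam_followers
    return spam_targets, spam_followers, overlap

# Toy data (made up): spammer s1 follows u1, u2, u3; u1, u2 and u4 follow s1.
edges = [
    ("s1", "u1"), ("s1", "u2"), ("s1", "u3"),
    ("u1", "s1"), ("u2", "s1"), ("u4", "s1"),
]
targets, followers, overlap = spam_neighborhood(edges, {"s1"})
overlap_share = len(overlap) / len(followers)
print(sorted(overlap), overlap_share)
```

Here `overlap_share` is the fraction of spam-followers who were also targeted, which is the quantity the caption reports as 82% for the real dataset.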
Figure 3: Number of spammers who rank within the top K according to Pagerank
Table 1: Follower-count statistics
Figure 3: Number of spammers (among the 41,352 identified ones) who rank within the top K according to Pagerank
Index
4. ANALYSIS OF LINK FARMERS
4.1 Popular users more likely to farm links
4.2 Top link farmers are not spammers
4.3 Top link farmers are active contributors
4.4 Top link farmers are social capitalists
4.5 Summary
5. COMBATING LINK FARMING
5.1 Collusionrank
5.1.1 Collusionrank + Pagerank
5.2 Evaluating Collusionrank
Effect on rankings of spammers
Effect on rankings of social capitalists following spammers
Effect on normal users who are neither spammers nor spam-followers
6. CONCLUSION
7. REFERENCES
4. Analysis Of Link Farmers
Our goal in this section is to gain better insight into what drives link farming in Twitter.
For this, we analyze the characteristics (network connectivity and tweeting activity) of the users who are willing to reciprocate links from arbitrary users, and their potential reasons for engaging in link farming.
4.1 Popular users more likely to farm links
Figure 5 shows how the probability of a user reciprocating a link from spammers varies with the user's indegree (number of followers).
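A curve of this kind can be produced by grouping spam-targets into indegree buckets and taking the follow-back fraction in each bucket. The sketch below uses power-of-ten buckets; the bucketing scheme, the function name `reciprocation_by_indegree`, and the sample records are assumptions chosen for illustration and do not reproduce the study's data.

```python
from collections import defaultdict

def reciprocation_by_indegree(targets):
    """targets: iterable of (indegree, followed_back) pairs, one per spam-target.

    Returns {bucket_lower_bound: fraction that followed back}, where the
    buckets are powers of ten (1-9, 10-99, 100-999, ...); 0 gets its own bucket.
    """
    totals = defaultdict(int)
    reciprocated = defaultdict(int)
    for indegree, followed_back in targets:
        # Number of decimal digits gives the power-of-ten bucket for positive ints.
        bucket = 10 ** (len(str(indegree)) - 1) if indegree > 0 else 0
        totals[bucket] += 1
        if followed_back:
            reciprocated[bucket] += 1
    return {b: reciprocated[b] / totals[b] for b in sorted(totals)}

# Made-up records: (number of followers, did the user follow the spammer back?)
sample = [
    (5, False), (8, False),
    (50, False), (70, True),
    (500, True), (900, True),
    (3000, True), (4000, False),
]
curve = reciprocation_by_indegree(sample)
print(curve)
```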
4.2 Top link farmers are not spammers
Figure 6: Node degree distributions of top 100K link farmers, spammers and a random sample of Twitter users. Top link farmers have very high indegree and outdegree compared to both spammers and a random population. Also, most of the top link farmers have indegree/outdegree ratios near 1.
Figure 6(a) Indegree: cumulative distribution of indegree (#followers).
Figure 6(b) Outdegree: cumulative distribution of outdegree (#followings).
Figure 6(c) Indegree / Outdegree: ratio of indegree to outdegree (followers per following) for the top link farmers and the 41,352 spammers.
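The three panels of Figure 6 are cumulative distributions, which can be computed directly from per-user degree counts. The following is a minimal sketch; the helper names `empirical_cdf` and `degree_ratio` and the toy degree pairs are hypothetical and only show the shape of the calculation.

```python
def empirical_cdf(values):
    """Return sorted (value, P(X <= value)) pairs for a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = {}
    for i, v in enumerate(ordered, start=1):
        cdf[v] = i / n  # later duplicates overwrite, leaving P(X <= v)
    return sorted(cdf.items())

def degree_ratio(indegree, outdegree):
    """Followers per following; infinite when the user follows nobody."""
    return indegree / outdegree if outdegree else float("inf")

# Made-up (indegree, outdegree) pairs for four users.
users = [(100, 100), (50, 100), (2000, 1000), (10, 40)]
ratios = [degree_ratio(i, o) for i, o in users]
ratio_cdf = empirical_cdf(ratios)
print(ratio_cdf)
```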
Top link farmers have indegrees and outdegrees one to two orders of magnitude higher than those of spammers. Their indegree-to-outdegree ratios are also considerably higher than those of spammers (and close to 1).
The fact that top link farmers exhibit very different network connectivity from spammers further suggests that a majority of top link farmers are not spammers.
4.3 Top link farmers are active contributors
We crawled the profile pages of the top 100,000 link farmers in July 2011.
Table 2: Characteristics from profile and activity of the top 100,000 link farmers
                         Top link farmers   Random sample
Has Lists                23%                4%
Has Location             84%                36%
Changed profile theme    84%                40%
Profile Pic              96%                50%
Has URL                  79%                14%
Has Bio                  87%                25%
Our analysis suggests that, compared to random Twitter users, the top link farmers are active users who make heavier use of their profile information and explore more of the features provided by Twitter.
Table 3: Names and extracts from Twitter account bios of 10 link farmers: the ones ranked highest according to Pagerank and the ones having the most links to spammers.
Top 5 link farmers according to Pagerank
Barack Obama: Obama 2012 campaign staff
Britney Spears: It's Britney
NPR Politics: Political coverage and conversation
UK Prime Minister: PM's office
JetBlue Airways: Follow us and let us help
Top 5 link farmers according to #links to spammers
Larry Wentz: Internet, Affiliate Marketing
Judy Rey Wasserman: Artist, founder
Chris Latko: Interested in tech. Will follow back
Paul Merriwether: helping others, let's talk soon
Aaron Lee: Social Media Manager
In order to gain more insight into the topical expertise of top link farmers, we generated a word cloud.
Figure 7: Word cloud of words in the Twitter account bios of (a) the top 100,000 link farmers and (b) a random sample of users.
Also, a manual analysis of 100 randomly selected top link farmers (as described in Section 4.2) showed that a majority of their tweets contain links to legitimate external web pages.
This is in contrast to the general Twitter population (the random sample), who describe themselves using words such as love, life, live, music, student, and friend, and most of whom never tweet links to external web pages.
4.4 Top link farmers are social capitalists
We now explore potential reasons why top link farmers participate in link farming.
Specifically, we ask the following question: what motivates legitimate, popular, and actively contributing Twitter users to indiscriminately follow back anyone who connects to them?
One simple and intuitive explanation is that these users have incentives similar to those of spammers.
Since the desire for social capital drives their link farming behavior, we call such users social capitalists.
Social capitalists connect to a vast majority (over 80%) of their network neighbors via reciprocated links.
They heavily interconnect with each other to increase their mutual influence.
The Twitter sub-graph formed by the 100,000 social capitalists is densely connected with approximately 181 million links, which implies a high network density of 0.018 (in comparison, the entire Twitter network has a density of 6.5 × 10^-7).
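Network density here can be read as the fraction of all possible directed links that actually exist. The sketch below assumes that definition (directed links, no self-loops); the function name `network_density` is hypothetical, and the whole-network inputs are the rounded dataset sizes quoted on the earlier dataset slide (54 million accounts, about 2 billion links).

```python
def network_density(num_nodes, num_links):
    """Fraction of possible directed links (no self-loops) that are present."""
    return num_links / (num_nodes * (num_nodes - 1))

# Toy check: 4 nodes admit 4 * 3 = 12 directed links; 6 present gives 0.5.
toy_density = network_density(4, 6)

# Whole-network estimate from the rounded dataset sizes (an approximation).
whole_network = network_density(54_000_000, 2_000_000_000)
print(toy_density, whole_network)
```

With these rounded inputs the whole-network value comes out near 7 × 10^-7, the same order of magnitude as the density quoted for the entire Twitter network.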
Social capitalists connect to a vast majority (over 80%) of their network neighbors via reciprocated links.
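The share of a user's neighbors reached via reciprocated links can be measured as follows. This is a minimal sketch assuming that a neighbor is anyone the user follows or is followed by; the function name `reciprocated_fraction` and the edge list are made up for the example.

```python
def reciprocated_fraction(user, follow_edges):
    """Fraction of a user's neighbors connected by links in both directions.

    follow_edges: iterable of (follower, followed) pairs.
    """
    edges = set(follow_edges)
    neighbors = {b for a, b in edges if a == user} | {a for a, b in edges if b == user}
    neighbors.discard(user)  # ignore self-follows, if any
    if not neighbors:
        return 0.0
    mutual = {v for v in neighbors if (user, v) in edges and (v, user) in edges}
    return len(mutual) / len(neighbors)

# Made-up neighborhood: u has mutual links with a, b, d, e; c follows u one-way.
edges = [
    ("u", "a"), ("a", "u"),
    ("u", "b"), ("b", "u"),
    ("c", "u"),
    ("u", "d"), ("d", "u"),
    ("u", "e"), ("e", "u"),
]
fraction = reciprocated_fraction("u", edges)
print(fraction)
```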
Figure: 318 biggest capitalists in the world (Glattfelder). Legend: super connected companies, very connected companies.
Finally, we analyzed the influence of the social capitalists in the network.
We computed the following three widely used metrics:
Follower-rank
Page-rank
Retweeted-rank
4.5 Summary
We analyzed the characteristics of the link farmers.
We find that the top link farmers are legitimate, popular, and highly active users in Twitter.
We conjectured that the motivating factor for such users might be the desire to acquire social capital and, thereby, influence. We showed evidence that these social capitalists connect with others with a similar desire to amass social capital, including each other and spammers.
5. Combating Link Farming
Input: network, G; set of known spammers, S; decay factor for biased Pagerank
Output: Collusionrank scores, c
initialize score vector d for all nodes n in G
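The pseudocode on this slide stops after the initialization step, so the following is only a sketch of one plausible reading, not the authors' algorithm as stated here. It assumes that known spammers start with an equal share of a total score of -1, that a user inherits negative score from the accounts they follow (each account's score divided among its followers), and that the decay factor mixes the propagated score with the static vector d. The update rule, the name `alpha`, its default of 0.85, and the fixed iteration count are all assumptions.

```python
def collusionrank(followings, spammers, alpha=0.85, iterations=50):
    """Sketch of a biased-Pagerank-style penalty score (assumed update rule).

    followings: dict {user: set of users that user follows}.
    spammers:   set of known spammer accounts (S).
    alpha:      decay factor for the biased Pagerank (assumed name and value).
    """
    nodes = set(followings)
    for followed in followings.values():
        nodes |= set(followed)

    followers_count = {n: 0 for n in nodes}
    for followed in followings.values():
        for v in followed:
            followers_count[v] += 1

    spammers = set(spammers) & nodes
    # Static score vector d: negative mass spread evenly over known spammers.
    d = {n: (-1.0 / len(spammers) if n in spammers else 0.0) for n in nodes}

    c = dict(d)
    for _ in range(iterations):
        new_c = {}
        for n in nodes:
            # A user is penalized for following low-scored accounts.
            inherited = sum(c[v] / followers_count[v] for v in followings.get(n, ()))
            new_c[n] = alpha * inherited + (1.0 - alpha) * d[n]
        c = new_c
    return c

# Hypothetical toy network: "farmer" follows two spammers, "normal" does not.
toy = {
    "farmer": {"spam1", "spam2"},
    "normal": {"celebrity"},
    "spam1": {"farmer"},
    "spam2": set(),
    "celebrity": set(),
}
scores = collusionrank(toy, {"spam1", "spam2"})
print(scores)
```

In this toy network the account that follows spammers ends up with a negative score, while accounts with no path to a spammer stay at zero.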