BitTorrent Darknets
Chao Zhang, Prithula Dhungel, Di Wu, Zhengye Liu and Keith W. Ross
Woonhak Kang
2010. 11. 04
VLDB Lab.
Contents
• Introduction
• BitTorrent (Background)
§ Architecture and Term.
§ Public and Private torrent sites
• Overview of BitTorrent Darknets Operation
• Analysis
§ Macroscopic
§ Medium-scopic
§ Microscopic
• Conclusion
SKKU VLDB Lab.
Introduction
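• Darknet
§ Private torrent sites
§ Open only to registered members
§ Join by invitation or during a site's temporary open-registration period
§ Each user's upload and download volume is recorded
- User access is restricted based on the up/down ratio
- Users with a high up/down ratio receive benefits
• Motivation
§ Darknets have received little attention from the research community.
§ Their distinctive policies give them different characteristics from public torrent sites.
§ Understanding the torrent ecosystem as a whole requires considering both public and private sites.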
Introduction
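• Analysis
§ Macroscopic
- Analysis of more than 800 private torrent sites
- Uses Sharky's list and Alexa rank
- Analyzes overall torrent file, user, and peer information
§ Medium-scopic
- Analysis of 4 popular private torrent sites
- Analyzes trackers, peers, users, and the files actually shared
- Correlation between public and private sites
§ Microscopic
- Analysis of HDChina
- Examines users' up/down records and activity time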
BitTorrent (Background)
• BitTorrent is a system for efficient and scalable replication of large amounts of static data
§ Scalable - the throughput increases with the number of downloaders
§ Efficient - it utilises a large amount of available network bandwidth
• The file to be distributed is split up into pieces and an SHA-1 hash is calculated for each piece
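The piece-hashing step can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `piece_hashes`, the 256-byte piece length, and the sample data are assumptions chosen to keep the example small (real torrents use much larger pieces).

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each piece."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


content = b"x" * 1000                 # stand-in for the bytes of a file
hashes = piece_hashes(content, 256)   # the last piece is shorter (232 bytes)
print(len(hashes))                    # 4
```

Each digest is 20 bytes long, so a downloader can verify every piece independently as it arrives.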
BitTorrent (Background)
• A metadata file (.torrent) is distributed to all peers
§ Usually via HTTP
• The metadata contains:
§ The SHA-1 hashes of all pieces
§ A mapping of the pieces to files
§ A reference to the tracker
BitTorrent (Background)
• The tracker is a central server keeping a list of all peers participating in the swarm
• A swarm is the set of peers that are participating in distributing the same files
• A peer joins a swarm by asking the tracker for a peer list and connects to those peers
Source: An Introduction to the BitTorrent Peer-to-Peer File-Sharing System, J.A. Pouwelse et al.
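The tracker's role on this slide can be illustrated with a toy in-memory model. This is only a sketch: `ToyTracker` and its `announce` method are hypothetical names, and a real tracker speaks HTTP or UDP rather than taking Python calls.

```python
from collections import defaultdict


class ToyTracker:
    """In-memory stand-in for a tracker: one peer set per swarm, keyed by infohash."""

    def __init__(self):
        self.swarms = defaultdict(set)

    def announce(self, info_hash: str, peer: tuple[str, int]) -> list[tuple[str, int]]:
        """Register the peer in the swarm and return the other peers already known."""
        others = sorted(self.swarms[info_hash] - {peer})
        self.swarms[info_hash].add(peer)
        return others


tracker = ToyTracker()
first = tracker.announce("abc", ("10.0.0.1", 6881))    # first peer: swarm is empty
second = tracker.announce("abc", ("10.0.0.2", 6881))   # second peer learns about the first
print(first, second)   # [] [('10.0.0.1', 6881)]
```

A joining peer would then open connections to the addresses in the returned list.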
BitTorrent (Background)
§ Private vs. public
§ Private flag set to 1
- Determines whether DHT and PEX are enabled
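How a client might act on the private flag can be sketched as below. The function name `peer_sources` and the string labels are assumptions made for illustration; the only behaviour taken from the slide is that the flag decides whether DHT and PEX are used.

```python
def peer_sources(info: dict) -> set[str]:
    """Return the peer-discovery mechanisms a client may use for this torrent."""
    sources = {"tracker"}
    if info.get("private") != 1:
        # Public torrent: decentralized discovery (DHT, PEX) is allowed as well.
        sources |= {"dht", "pex"}
    return sources


print(sorted(peer_sources({"name": "example", "private": 1})))  # ['tracker']
print(sorted(peer_sources({"name": "example"})))                # ['dht', 'pex', 'tracker']
```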
Overview of BitTorrent Darknets Operation
• Darknet owner
§ Runs the web site and the tracker
• User
§ Registers on the web site and gets a “pass key”
§ Invitation system
§ Tracker Checker, BTRACS
• Incentive policies
§ Ratio incentive
§ Enforced minimum ratio
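A minimal sketch of per-user ratio accounting follows, assuming the tracker attributes each transfer report to an account through its pass key. The class names, the `report` and `may_download` methods, and the 0.5 threshold are all hypothetical; the slides do not describe how any particular site implements this.

```python
from dataclasses import dataclass


@dataclass
class Account:
    uploaded: int = 0
    downloaded: int = 0

    def ratio(self) -> float:
        """Upload/download ratio; treated as infinite before any download."""
        return float("inf") if self.downloaded == 0 else self.uploaded / self.downloaded


class RatioLedger:
    """Accumulates transfer volume per pass key and enforces a minimum ratio."""

    def __init__(self, min_ratio: float):
        self.min_ratio = min_ratio
        self.accounts: dict[str, Account] = {}

    def report(self, passkey: str, up_delta: int, down_delta: int) -> None:
        acct = self.accounts.setdefault(passkey, Account())
        acct.uploaded += up_delta
        acct.downloaded += down_delta

    def may_download(self, passkey: str) -> bool:
        acct = self.accounts.get(passkey)
        # Unknown pass keys are rejected: only registered users are served.
        return acct is not None and acct.ratio() >= self.min_ratio


ledger = RatioLedger(min_ratio=0.5)   # hypothetical threshold
ledger.report("k1", up_delta=30, down_delta=100)
below = ledger.may_download("k1")     # ratio 0.3, below the minimum
ledger.report("k1", up_delta=40, down_delta=0)
above = ledger.may_download("k1")     # ratio 0.7, allowed again
print(below, above)                   # False True
```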
Analysis
• Macroscopic
§ Rough idea about
- How many BitTorrent darknets exist
- How many files are being shared
- How many users participate
- Where the trackers are located
- Where the users of the darknets are located
• Methodology
§ Find darknets: Sharky's list
§ Alexa rank
§ Crawler
Analysis
• Sharky's list
§ 900+ darknets in June 2009
§ 963 today
• How the list is created
§ Tracker checker websites
§ File sharing blogs and forums
§ Google search
§ IRC invite channel
Analysis
• In this paper
§ Sites checked manually; only operational sites kept
§ 863 private sites
• Category analysis
§ 55% General
Analysis
• Geographic distribution
§ Using MaxMind GeoIP
§ Europe (led by the Netherlands)
Analysis
• Which sites are most popular?
§ Using Alexa's rank
§ Alexa rank
- Provides usage statistics
§ Pick the 15 most popular darknets
- 6 of them are located in the Netherlands
- 1 in China
Analysis
• Top site: Torrents.ru
§ 612,000 torrents
§ 3.5 million user accounts
• Usage by country
§ Where is the Netherlands?
Analysis
• Total estimation
§ Regression analysis between Alexa rank and darknet statistics
§ Obtained 33 private sites (out of 67 sites)
§ Manually gathered statistics from the sites (some of them partial)
§ # of torrents: 0.84
§ # of accounts: 0.81
§ # of peers: 0.89
Analysis
• Total estimation
§ Obtain the correlation equations (X is the Alexa rank)
§ yt = torrents
§ ya = accounts
§ yp = peers
• Aggregate the total estimate using the equations
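The estimation procedure can be sketched in code. The fitted equations themselves did not survive in these slides, so the sketch assumes a log-log linear form, log10(y) = a + b * log10(X), which may not be the form the paper uses. The sample ranks and torrent counts are made up for illustration and are not the paper's data.

```python
import math


def fit_power_law(ranks, values):
    """Least-squares fit of log10(value) = a + b * log10(rank); returns (a, b)."""
    xs = [math.log10(r) for r in ranks]
    ys = [math.log10(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, rank):
    """Estimated value for a site with the given Alexa rank."""
    return 10 ** (a + b * math.log10(rank))


# Made-up sample: sites whose statistics were gathered by hand.
sample_ranks = [1_000, 10_000, 100_000]
sample_torrents = [100_000, 10_000, 1_000]
a, b = fit_power_law(sample_ranks, sample_torrents)

# Sites with a known rank but no published statistics (also made up).
unmeasured_ranks = [5_000, 50_000]
estimated_total = sum(predict(a, b, r) for r in unmeasured_ranks)
print(round(estimated_total))
```

The same fit would be repeated for accounts (ya) and peers (yp), and the per-site predictions summed over all darknets with a known rank.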
Analysis
• Private vs. public
§ Public: top 5 public torrent sites (Mininova, Pirate Bay, Torrent Reactor, Btmonster, and Torrent Portal)
§ Collected
- 8.8 million .torrent files (4.6 million unique infohashes)
- 38,996 trackers
§ Observed
- 5,085,217 unique peers
• Summary
§ Darknets
- The private world is comparable in size to the public sites
- 4.4 million torrents vs. 4.6 million infohashes
- More active peers than the public sites
Analysis
• Medium-scopic
§ 4 sites
- Torrents.ru, Zamunda, BitSoup, HDChina
- Each uses only one tracker
- Crawled from April 11, 2009 to June 13, 2009
- Zamunda, BitSoup, HDChina
- Torrents.ru has its private flag set to 0 and uses DHT
- Active torrent: has at least one active peer
Analysis
• Overlap with the public ecosystem
§ Infohash-based
- Two torrents have the same infohash (SHA-1)
§ Piece-based
- Because of the private flag, the same content gets a different infohash
- Alternative
- Match the hash of each piece
- Better than infohash matching
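The difference between the two matching methods can be shown with a small sketch. It relies on the standard definition of the infohash as the SHA-1 of the bencoded info dictionary. The helper names (`bencode`, `infohash`, `piece_set`) and the sample torrent are assumptions, and comparing whole piece sets for equality is a simplification of whatever matching rule the paper applies.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, strings/bytes, lists, and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def infohash(info: dict) -> str:
    """SHA-1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()


def piece_set(info: dict) -> set[bytes]:
    """The individual 20-byte piece hashes stored in the 'pieces' string."""
    pieces = info["pieces"]
    return {pieces[i:i + 20] for i in range(0, len(pieces), 20)}


# Made-up torrent: two pieces of some content.
pieces = hashlib.sha1(b"piece-0").digest() + hashlib.sha1(b"piece-1").digest()
public_info = {"name": "example", "piece length": 262144, "length": 300000, "pieces": pieces}
private_info = dict(public_info, private=1)   # same content, private flag added

same_infohash = infohash(public_info) == infohash(private_info)
same_pieces = piece_set(public_info) == piece_set(private_info)
print(same_infohash, same_pieces)   # False True
```

Adding the flag changes the bencoded dictionary and therefore the infohash, while the piece hashes depend only on the content.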
Analysis
• Overlap with the public ecosystem
§ Infohash-based
§ Piece-based
§ Comparatively low overlap ratio between the darknets
Analysis
• Overlap between public sites
§ More than 50%
Analysis
• Title match, extended match
§ Title match
- The same file has the same title
- e.g. Ghost Ship. HDDVD.1080p.DTS.x264-CtrlHD
- Title, source media, resolution, codec, release team, and so on
§ Extended match
- Title match + the same file size (within 5%)
- Same file but a different hash set
- Different encoding rate, different language
§ Methodology
- Top 100 torrents and 100 random torrents
- Perform TM and EM checks
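The two checks can be sketched as below. Several details are assumptions the slides do not settle: titles are normalized by lower-casing and collapsing separators, and "within 5%" is measured against the larger of the two sizes. The function names and sample sizes are hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Lower-case the title and collapse whitespace, dots, underscores, and hyphens."""
    return " ".join(re.split(r"[\s._-]+", title.lower())).strip()


def title_match(title_a: str, title_b: str) -> bool:
    """TM: the normalized titles are identical."""
    return normalize_title(title_a) == normalize_title(title_b)


def extended_match(title_a: str, size_a: int, title_b: str, size_b: int,
                   tolerance: float = 0.05) -> bool:
    """EM: title match plus file sizes within the tolerance of the larger size."""
    if not title_match(title_a, title_b):
        return False
    return abs(size_a - size_b) <= tolerance * max(size_a, size_b)


private_title = "Ghost Ship. HDDVD.1080p.DTS.x264-CtrlHD"
public_title = "ghost.ship.hddvd.1080p.dts.x264-ctrlhd"
tm = title_match(private_title, public_title)
em_close = extended_match(private_title, 8_000_000_000, public_title, 8_200_000_000)
em_far = extended_match(private_title, 8_000_000_000, public_title, 9_000_000_000)
print(tm, em_close, em_far)   # True True False
```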
Analysis
• Leakage into the public ecosystem
§ IPs have leaked into a DHT
- If we know these IPs, we don't need to register on the private site
§ Methodology
- Develop a DHT crawler
- Crawl the DHT system for all the infohashes obtained from private sites
- Low leakage rate except for Torrents.ru
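A leakage measurement along these lines can be sketched as follows. The definition used here (the fraction of private infohashes for which the DHT returns at least one peer) is an assumption, and `dht_lookup` is a hypothetical callable standing in for the crawler; the dictionary below is made-up data, not a real DHT.

```python
def leakage_rate(private_infohashes, dht_lookup) -> float:
    """Fraction of private infohashes for which the DHT returns at least one peer."""
    hashes = list(private_infohashes)
    if not hashes:
        return 0.0
    leaked = sum(1 for h in hashes if dht_lookup(h))
    return leaked / len(hashes)


# Stand-in for DHT crawl results: infohash -> peers found in the DHT.
fake_dht = {"aa": ["203.0.113.5:6881"], "bb": []}
rate = leakage_rate(["aa", "bb", "cc", "dd"], lambda h: fake_dht.get(h, []))
print(rate)   # 0.25
```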
Analysis
• Characteristics of private torrents
§ Newly released torrents attract more peers
§ Decay on private sites is much smaller
- Because of the purging policy
- Unpopular torrents are removed
Analysis
• Characteristics of private torrents
§ Average torrent age on private sites is smaller
§ Because of the purging policy
- Old, unpopular torrents are removed by the administrator
§ Rank
- Has a longer tail
Analysis
• Microscopic
§ HDChina
- HD movies and TV series
- 18,054 user accounts
- 15,738 active torrents
- 10GB, 0.3 up/down ratio
- 100GB, 0.7 up/down ratio
- Parsed all user data in HDChina
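One possible reading of the two ratio figures is a tiered minimum: users who have downloaded at least 10 GB must keep a ratio of 0.3, and users who have downloaded at least 100 GB must keep 0.7. The slide does not spell this out, so the tier semantics, the binary definition of a gigabyte, and the function names below are all assumptions.

```python
GB = 1024 ** 3

# (download volume threshold in bytes, minimum ratio), highest tier first.
TIERS = [(100 * GB, 0.7), (10 * GB, 0.3)]


def required_ratio(downloaded: int) -> float:
    """Minimum up/down ratio that applies at the given download volume."""
    for threshold, ratio in TIERS:
        if downloaded >= threshold:
            return ratio
    return 0.0   # below the lowest tier: no requirement (assumed)


def meets_policy(uploaded: int, downloaded: int) -> bool:
    """True if the account satisfies the tier that applies to it."""
    if downloaded == 0:
        return True
    return uploaded / downloaded >= required_ratio(downloaded)


print(meets_policy(2 * GB, 5 * GB))      # True: under 10 GB, no requirement
print(meets_policy(4 * GB, 20 * GB))     # False: ratio 0.2, needs 0.3
print(meets_policy(80 * GB, 100 * GB))   # True: ratio 0.8, needs 0.7
```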
Analysis
• Microscopic
§ Up/down rate
- Incentive policy
- Total up/down
- 17,054 TB / 2,568 TB
- Many users upload more than 1 TB
Analysis
• Microscopic
§ Share rate (up/down)
- More than 90% of users have a ratio higher than 1
- Fewer than 5% have a ratio higher than 100
Analysis
• Microscopic
§ Online time
- 50% of users return within 10 hours
- 95% of users return within 100 hours
Conclusion
• Investigated 800+ private torrent sites
§ In terms of geographic concentration and content distribution
§ Using Sharky's list and Alexa rank
- Presents an informative view of the darknet landscape
§ Regression analysis
- Gives us an estimate of the total size
§ Private torrent sites
- Are relatively small individually, but the aggregate size of the darknets is large
• Popular torrent sites
§ Private sites vs. public sites
§ Overlap, leakage
QnA
• Any questions?