A P2P Flow Identification Model Based on Bayesian Network
Published in: Wireless Communications, Networking and Mobile Computing (WiCOM), 2011 7th International Conference on
Date of Conference: 23-25 Sept. 2011
102062626 黃柏勛, first-year master's student, Computer Science
Abstract
❖ 1. Construct a uniform P2P flow identification model: UFIM (Uniform Flow Identification Model).
❖ 2. Propose describing UFIM abstractly with a Bayesian network model.
❖ 3. Define 6 measurements of identification performance.
❖ 4. Comparing theoretical analysis with experiments shows that UFIM can describe various types of P2P flow identification methods abstractly.
❖ 5. This work lays the groundwork for developing new identification methods.
Introduction
❖ P2P flow identification methods fall into 4 classes:
1. Port identification
2. Application-layer characteristic word identification
3. Transport-layer heuristic identification
4. Machine learning identification
❖ Erman et al. used two datasets to compare 3 unsupervised clustering algorithms: K-means, DBSCAN, and AutoClass.
They compared accuracy and running time, but not processing rate, real-time capability, or CPU and memory consumption.
❖ Although many P2P flow identification methods exist, a detailed comparison and analysis of the different methods is lacking.
Introduction
❖ This paper presents UFIM (Uniform Flow Identification Model) to describe different P2P flow identification methods, and gives a theory for describing UFIM abstractly using a Bayesian network.
❖ It groups the current flow identification characteristics into two categories, "basic characteristics" and "statistical characteristics", to reduce implementation complexity. => A Bayesian network modeling method for constructing specific identification approaches.
❖ It defines 6 measurements for analyzing identification methods.
II. P2P FLOW IDENTIFICATION MODEL
❖ Current P2P flow identification methods differ in implementation approach but share the same essential characteristic: set mapping.
❖ Suppose flows denotes the set of flows to be identified and classified, and Y denotes the set of known application protocols. Then any identification method can be written as F : flows → Y, that is, a mapping from the flow set flows to the application protocol set Y.
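
The set-mapping view can be illustrated with a minimal Python sketch. It uses port identification as one possible F; the `Flow` record, the protocol labels, and the port table are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, NamedTuple


class Flow(NamedTuple):
    """Hypothetical flow record; the field names are assumptions."""
    src_port: int
    dst_port: int


# Y: the known application protocol set (illustrative labels only).
Y = {"bittorrent", "edonkey", "unknown"}

# Illustrative port table built from commonly cited default ports.
PORT_TABLE = {6881: "bittorrent", 4662: "edonkey"}


def port_identifier(flow: Flow) -> str:
    """One possible F: map a flow to a protocol in Y by its ports."""
    for port in (flow.src_port, flow.dst_port):
        if port in PORT_TABLE:
            return PORT_TABLE[port]
    return "unknown"


# Every identification method has the same shape, F : flows -> Y.
F: Callable[[Flow], str] = port_identifier

flows = [Flow(51000, 6881), Flow(4662, 52000), Flow(50000, 80)]
labels = [F(flow) for flow in flows]
```

A signature-based or machine-learning method would replace `port_identifier` but keep the same signature, which is the point of the uniform model.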
II. P2P FLOW IDENTIFICATION MODEL
UFIM consists mainly of 3 parts:
(1) Characteristic set X = {A1, A2, ..., Am}, where each Ai is a random variable denoting a flow identification characteristic;
(2) Application protocol set Y = {y1, y2, ..., yn}, where each yi is a vector of m values, one for each of the m random variables in X, that identifies one application protocol;
(3) Mapping function F: for a given flow flow_i, F determines the application protocol yk to which the flow belongs.
II. P2P FLOW IDENTIFICATION MODEL
We (1) take a flow record flow_i from the flows set and (2) construct a value vector a = {a1^0, a2^0, ..., am^0} corresponding to the m characteristics in X. We then (3) compare a with the n vectors in Y and (4) output the application protocol yk with the highest similarity as the result.
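
The four steps above can be sketched in Python as follows. The slides do not say which similarity measure is used, so this sketch assumes the fraction of matching components; the protocol names and characteristic values are hypothetical.

```python
from typing import Dict, Sequence


def similarity(a: Sequence, y: Sequence) -> float:
    """Assumed similarity: fraction of positions where a and y agree."""
    if len(a) != len(y):
        raise ValueError("vectors must have the same length m")
    return sum(1 for ai, yi in zip(a, y) if ai == yi) / len(a)


def identify(a: Sequence, Y: Dict[str, Sequence]) -> str:
    """Return the protocol whose vector in Y is most similar to a."""
    return max(Y, key=lambda name: similarity(a, Y[name]))


# Hypothetical protocol vectors over m = 3 characteristics:
# (transport protocol, destination port, payload prefix).
Y = {
    "protocol_A": ("tcp", 6881, "0x13"),
    "protocol_B": ("udp", 4672, "0xe3"),
}

# Value vector constructed from one flow record.
a = ("tcp", 6881, "0x00")
result = identify(a, Y)
```

Here `a` agrees with `protocol_A` on two of three characteristics and with `protocol_B` on none, so `protocol_A` is returned.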
II. P2P FLOW IDENTIFICATION MODEL
The accuracy A_UFIM of UFIM is determined by 3 parts:
① Ay: the accuracy of set Y, which depends on the application protocol classification and on the accuracy of the vectors in Y;
② Aflow: the accuracy of the characteristic value vector a constructed for an unknown flow flow_i;
③ Af: the accuracy of the mapping function F.
III. Bayesian network description
Background
1. Bayesian network
2. Basic and statistical characteristics
(Characteristic selection is important for identification.)
III. Bayesian network description
-- Basic characteristics are characteristics that can be extracted directly from a single block, denoted by Ai^0; together they form the basic characteristic set.
-- Statistical characteristics are characteristics derived from the basic characteristics of multiple messages, denoted by Ai^j, where the index i identifies the basic characteristic Ai they are derived from.
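
The distinction can be sketched in Python. A basic characteristic is read from one message, while a statistical characteristic is computed from that basic characteristic across several messages. The key names (`payload_len`, `arrival_time`) and the choice of mean and variance are assumptions for illustration, not the paper's selected characteristics.

```python
from statistics import mean, pvariance
from typing import Dict, List

# Each dict holds the basic characteristics of one message.
# The key names are hypothetical.
messages: List[Dict[str, float]] = [
    {"payload_len": 100.0, "arrival_time": 0.00},
    {"payload_len": 300.0, "arrival_time": 0.10},
    {"payload_len": 200.0, "arrival_time": 0.30},
]


def basic(message: Dict[str, float], name: str) -> float:
    """Basic characteristic: read directly from a single message."""
    return message[name]


def statistical(msgs: List[Dict[str, float]], name: str) -> Dict[str, float]:
    """Statistical characteristics: derived from one basic
    characteristic observed over multiple messages."""
    values = [basic(m, name) for m in msgs]
    return {"mean": mean(values), "variance": pvariance(values)}


stats = statistical(messages, "payload_len")
```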
III. Bayesian network description
After studying the existing identification methods and the 248 characteristics listed in [10], we selected 7 basic characteristics, as Table I shows.
IV. Performance measurements
Def. 1: flow identification rate T: the maximum number of packets needed for flow identification, that is, the number of captured packets f needs in order to construct all the identification characteristics, where ni denotes the number of packets needed to construct Xi.
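
The formula for T did not survive on this slide. A minimal sketch, assuming T is simply the largest ni over all characteristics (which is how the definition reads), with made-up packet counts:

```python
from typing import Dict

# n_i: packets needed to construct characteristic X_i.
# The values are illustrative, not from the paper.
packets_needed: Dict[str, int] = {"X1": 1, "X2": 1, "X3": 5, "X4": 10}


def identification_rate(n: Dict[str, int]) -> int:
    """T under the assumed reading: the largest n_i, since f cannot
    decide before its slowest characteristic is available."""
    return max(n.values())


T = identification_rate(packets_needed)
```

Under this reading, a method that relies only on single-packet characteristics has T = 1, while adding one characteristic that needs 10 packets raises T to 10.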
IV. Performance measurements
Def. 2: protocol distinguishing rate I: suppose f determines the application protocol of a flow using a conditional probability; then I denotes the probability of misidentification, that is, the proportion of misidentified flows among all flows.
IV. Performance measurements
Def. 3: characteristic offset W: packets belonging to χ are regarded as unknown flows; W denotes the proportion of unknown flows among all flows.
Def. 4: identification robustness H: whether the correctness of f depends on the packet arrival order.
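
Both I and W are proportions over the total number of flows, so they can be computed from labeled identification results. The sketch below assumes each result is a pair of true and predicted protocol, with `None` standing for "unknown"; this representation and the sample labels are assumptions.

```python
from typing import List, Optional, Tuple

Result = Tuple[str, Optional[str]]  # (true protocol, prediction or None)


def distinguishing_rate_and_offset(results: List[Result]) -> Tuple[float, float]:
    """Return (I, W): the share of misidentified flows and the share
    of flows reported as unknown, both over all flows."""
    total = len(results)
    if total == 0:
        raise ValueError("at least one flow is required")
    unknown = sum(1 for _, pred in results if pred is None)
    wrong = sum(1 for true, pred in results
                if pred is not None and pred != true)
    return wrong / total, unknown / total


results: List[Result] = [
    ("bittorrent", "bittorrent"),
    ("bittorrent", "edonkey"),   # misidentified
    ("edonkey", "edonkey"),
    ("edonkey", None),           # unknown
    ("http", "http"),
]
I, W = distinguishing_rate_and_offset(results)
```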
IV. Performance measurements
Def. 5: flow identification time consumption L: the time needed for flow identification, equal to the time complexity of f.
Def. 6: flow identification space S: the memory space f needs to identify a flow, equal to the space complexity of f.
IV. Performance measurements
T reflects the real-time performance of f,
I and W reflect the correctness of f,
H reflects the robustness of f,
L and S reflect the complexity of f.
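
As an illustration of the robustness measure H, one rough empirical check is to reorder the packets of a flow and see whether the output of f changes. This is only a sketch: the two toy identifiers and their thresholds are hypothetical, and passing the check on one flow does not prove robustness in general.

```python
import itertools
from typing import Callable, List


def is_order_robust(f: Callable[[List[int]], str],
                    packets: List[int],
                    max_permutations: int = 120) -> bool:
    """Check whether f gives the same label for reorderings of packets
    (at most max_permutations orderings are tried)."""
    baseline = f(packets)
    perms = itertools.islice(itertools.permutations(packets), max_permutations)
    return all(f(list(p)) == baseline for p in perms)


def by_first_packet(packets: List[int]) -> str:
    """Toy identifier that looks only at the first packet size."""
    return "p2p" if packets[0] >= 1000 else "other"


def by_mean_size(packets: List[int]) -> str:
    """Toy identifier that uses the mean packet size."""
    return "p2p" if sum(packets) / len(packets) >= 500 else "other"


packets = [1200, 60, 900]  # packet sizes in arrival order (made up)
first_packet_robust = is_order_robust(by_first_packet, packets)
mean_size_robust = is_order_robust(by_mean_size, packets)
```

The first identifier depends on which packet arrives first, so reordering changes its answer; the mean is order-independent, so the second one is unaffected.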
V. Experiment Analysis
- F denotes the number of misidentified P2P flows, including false negatives and false positives.
- U denotes the unknown flows.
- [A,B] denotes a set of test data.
V. Experiment Analysis
- I and W denote the protocol distinguishing rate (erroneous protocol judgment) and the characteristic offset.
- [A,B] denotes a set of test data.
V. Experiment Analysis
From the identification results we can conclude that the proportions of F and U in the P2P traffic match I and W, so I and W can be used to denote the identification accuracy.
VI. Summary and My report
- Different identification methods can be analyzed and compared within one uniform model.
- Math is important.