1
Version Space using DNA Computing
2001.10.26 임희웅
2
Version Space(1) Version Space?
Concept Learning: classify a given instance x
Maintain the set of hypotheses that is consistent with the training examples
Instance x: described by a tuple of attributes
Attributes: Dept {ee, cs}, Status {faculty, staff}, Floor {four, five}
3
Version Space(2) Hypotheses H
Each hypothesis is described by a conjunction of constraints on the attributes
Ex) <cs, faculty> or <cs>
Target concept: X → {0, 1}
Training examples D:
<cs, faculty, four> +
<cs, faculty, five> +
<ee, faculty, four> -
<cs, staff, five> -
4
Version Space(3) Hypotheses consistent with a training example
All combinations of the attributes of the training example, i.e. its power set
Training example: <cs, faculty, four>
Consistent hypotheses:
<cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <cs>, <faculty>, <four>
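The "power set" observation above is easy to check mechanically. A minimal Python sketch (names are illustrative; the deck itself works in DNA, not code) that enumerates every subset of the example's attribute values, including the empty hypothesis <>:

```python
from itertools import combinations

example = ("cs", "faculty", "four")

# Every subset of the example's attribute values is a consistent hypothesis
# (the power set), including the empty hypothesis <>.
hypotheses = [tuple(c) for r in range(len(example), -1, -1)
              for c in combinations(example, r)]

for h in hypotheses:
    print("<" + ", ".join(h) + ">")
```

This prints the 7 hypotheses listed above plus <>, 8 in total.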
5
Version Space(4)
[Hypothesis lattice diagram: single attributes cs, ee, faculty, staff, four, five; two-attribute conjunctions cs ∧ faculty, ee ∧ faculty, cs ∧ staff, ee ∧ staff, faculty ∧ four, faculty ∧ five; three-attribute conjunctions cs ∧ faculty ∧ four, cs ∧ faculty ∧ five, ee ∧ faculty ∧ four, cs ∧ staff ∧ five]
6
[Lattice diagram repeated]
+
Consistent hypotheses: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <cs>, <faculty>, <four>, <>
7
[Lattice diagram repeated]
+ –
Consistent hypotheses: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <faculty>, <four>
8
[Lattice diagram repeated]
+ + –
Consistent hypotheses: <cs, faculty>, <faculty>
9
[Lattice diagram repeated]
+ + – –
Consistent hypotheses: <cs, faculty>
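The elimination traced on the slides above reduces to plain set operations: intersect with the consistent set of each positive example, subtract the consistent set of each negative one. A minimal sketch, treating each hypothesis as a set of attribute values (the order in which examples are processed does not change the final result):

```python
from itertools import combinations

def consistent(values):
    """All hypotheses consistent with one example: the power set of its values."""
    return {frozenset(c) for r in range(len(values) + 1)
            for c in combinations(values, r)}

D = [(("cs", "faculty", "four"), "+"),
     (("cs", "faculty", "five"), "+"),
     (("ee", "faculty", "four"), "-"),
     (("cs", "staff", "five"), "-")]

first_values, _ = D[0]          # first example assumed positive
vs = consistent(first_values)
for values, label in D[1:]:
    if label == "+":
        vs &= consistent(values)   # keep hypotheses also consistent with the positive
    else:
        vs -= consistent(values)   # drop hypotheses that cover the negative

# The only surviving hypothesis is <cs, faculty>, matching the last slide.
print(vs)
```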
10
Version Space using DNA Computing
Problem Definition
Attributes: Dept {ee, cs}, Status {faculty, staff}, Floor {four, five}
Training examples D:
<cs, faculty, four> +
<cs, faculty, five> +
<ee, faculty, four> -
<cs, staff, five> -
11
Encoding.1 When an order among the attributes is considered
Each attribute value is represented by a basic DNA sequence, and sticky-end conditions are set so that these basic sequences can only be ligated to sequences belonging to different attributes.
In this case, hypotheses such as <cs, faculty> or <faculty, four> are generated, but <cs, four> is not.
Dept: ATGCA AATTG
Status: TACGT
Floor: TTAAC
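The ordering constraint can be illustrated with a toy model: each attribute value carries sticky ends that only pair Dept→Status and Status→Floor, so strands can only ligate in that fixed order. The end labels below are hypothetical placeholders, not the actual sequences from the slide:

```python
# Illustrative model of Encoding.1: sticky-end labels (hypothetical) enforce
# a fixed Dept -> Status -> Floor ligation order.
strands = {
    "cs":      ("DEPT_L",   "DEPT_R"),
    "ee":      ("DEPT_L",   "DEPT_R"),
    "faculty": ("STATUS_L", "STATUS_R"),
    "staff":   ("STATUS_L", "STATUS_R"),
    "four":    ("FLOOR_L",  "FLOOR_R"),
    "five":    ("FLOOR_L",  "FLOOR_R"),
}
# Sticky-end compatibility: the right end of one strand can ligate to the
# left end of the next only for these pairs.
compatible = {("DEPT_R", "STATUS_L"), ("STATUS_R", "FLOOR_L")}

def can_ligate(a, b):
    return (strands[a][1], strands[b][0]) in compatible

print(can_ligate("cs", "faculty"))   # True  -> <cs, faculty> forms
print(can_ligate("faculty", "four")) # True  -> <faculty, four> forms
print(can_ligate("cs", "four"))      # False -> <cs, four> cannot form
```

This reproduces exactly the limitation stated above: <cs, four> skips the Status attribute and so never assembles under this encoding.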
12
Encoding.2(1)
When no order among the attributes is considered
Use the encoding of Adleman's experiment:
Attribute value : vertex
Ligation of attribute values : edge
Complete graph, overhead
13
Encoding.2(2)
Graph for All Hypotheses
[Graph diagram on the vertices cs, ee, faculty, staff, four, five]
Vertex strands:
cs-H cs-T, ee-H ee-T, faculty-H faculty-T, staff-H staff-T, four-H four-T, five-H five-T
Edge strands:
cs-T’ faculty-H’, cs-T’ staff-H’, cs-T’ four-H’, cs-T’ five-H’,
ee-T’ faculty-H’, ee-T’ staff-H’, ee-T’ four-H’, ee-T’ five-H’,
faculty-T’ four-H’, faculty-T’ five-H’, staff-T’ four-H’, staff-T’ five-H’
+ Dummy strand for blunt end?
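The edge list above links every value to every value of a later attribute, so (unlike Encoding.1) hypotheses that skip an attribute, such as <cs, four>, also assemble. A sketch of which hypothesis strands this graph generates (a simulation of the combinatorics only, not of the chemistry):

```python
from itertools import product

# Encoding.2 sketch: attribute values are vertices; edge strands link the
# tail half (T) of an earlier attribute's value to the head half (H) of a
# later one, mirroring the edge list above.
attributes = [("cs", "ee"), ("faculty", "staff"), ("four", "five")]

# Edge strands between every pair of values from different attributes,
# oriented Dept -> Status -> Floor.
edges = set()
for i in range(len(attributes)):
    for j in range(i + 1, len(attributes)):
        for a, b in product(attributes[i], attributes[j]):
            edges.add((a, b))

def hypotheses():
    """Assemble every path that uses at most one value per attribute."""
    out = set()
    def extend(path, next_attr):
        if path:
            out.add(tuple(path))
        for k in range(next_attr, len(attributes)):
            for v in attributes[k]:
                if not path or (path[-1], v) in edges:
                    extend(path + [v], k + 1)
    extend([], 0)
    return out

hs = hypotheses()
print(len(hs))               # 26 non-empty hypothesis strands (3*3*3 - 1)
print(("cs", "four") in hs)  # True: Encoding.2 also yields <cs, four>
```

The count 26 is the "overhead" trade-off: a complete graph needs many edge strands, but it reaches every hypothesis.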
14
Encoding.2(3) Graph for hypotheses consistent with <cs, faculty, four>
[Graph diagram on the vertices cs, faculty, four]
Vertex strands: cs-H cs-T, faculty-H faculty-T, four-H four-T
Edge strands: cs-T’ faculty-H’, cs-T’ four-H’, faculty-T’ four-H’
+ Dummy strand for blunt end?
15
Encoding.2(4) The double strands produced by the previous process are not used directly; instead, only the single strands encoding the vertex parts are extracted and used.
The single strands can be extracted using strands complementary to the strand for each attribute value (beads?).
Why only single strands? For the intersection operation using hybridization.
An order was imposed on the attributes appearing in a hypothesis strand.
16
Encoding.3(1) Using beads
Requires far fewer sequences than the Adleman-style encoding above.
Also, all possible hypotheses can be generated at once, or all hypotheses consistent with a particular example can be generated.
[Bead with Dept, Status, Floor strands] + a dummy sequence for each attribute
17
Encoding.3(2) Problem!!
How can the hypotheses be amplified?
18
Detection(1) – Encoding.2
1. Tube1(0): all hypotheses consistent with the first training example (assumed to be positive)
2. For each subsequent example, do the following:
Tube2: all hypotheses consistent with the example
If positive: Tube1(n+1) = Tube1(n) ∩ Tube2
If negative: Tube1(n+1) = Tube1(n) - Tube2
19
Detection(2) – Encoding.2 Implementation of the ‘∩’ operation
The initial Tube1(0) is generated as proposed in Encoding.2 (single strands).
For each subsequent example, Tube2 is generated with an encoding complementary to Tube1(0) (single strands).
Tube2 is added to Tube1 and hybridized, and only the fully bound double strands are extracted.
The result is denatured, and only the single strands encoded in the same direction as the original Tube1(0) are extracted again.
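Read as set logic, the hybridization step keeps exactly the Tube1 strands whose perfect complement is present in Tube2. A sketch with illustrative sequences (concatenations of the Encoding.1 example strands, purely for demonstration):

```python
# '∩' via hybridization: a Tube1 strand survives iff its exact reverse
# complement is in Tube2. Sequences here are hypothetical placeholders.
COMP = str.maketrans("ACGT", "TGCA")

def complement(s):
    """Reverse complement of a strand."""
    return s.translate(COMP)[::-1]

tube1 = {"ATGCATACGT", "ATGCATTAAC", "TACGTTTAAC"}   # hypothesis strands
keep  = {"ATGCATACGT", "TACGTTTAAC"}                 # consistent with new example
tube2 = {complement(s) for s in keep}                # complementary encoding

# Full hybridization, extraction of double strands, then denaturing:
double_strands = {s for s in tube1 if complement(s) in tube2}
print(double_strands == tube1 & keep)  # True: exactly the set intersection
```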
20
Detection(3) – Encoding.2 Implementation of the ‘-’ operation
How? =_=; An ‘∩’ operation with the complement set? Another encoding for negative examples?
21
Detection(4) – Encoding.2 Problem
What if amplification of the strands is needed?
22
Detection – Encoding.3 For the bead-based encoding
How? =_=; ‘∩’, ‘-’, amplification
23
Application When an actual input arrives, how should it be classified?
Voting?
24
Reference on Version Space
Machine Learning, T. M. Mitchell, McGraw-Hill
Artificial Intelligence: Theory and Practice, Dean, Addison-Wesley