1
Version Space using DNA Computing
2001.10.26 임희웅
2
Version Space(1) Version Space?
Concept Learning: classify a given instance x
Maintain the set of hypotheses that is consistent with the training examples
Instance x: described by a tuple of attributes
Attributes: Dept {ee, cs}, Status {faculty, staff}, Floor {four, five}
3
Version Space(2) Hypotheses H
Each hypothesis is described by a conjunction of constraints on the attributes
Ex) <cs, faculty> or <cs>
Target concept: X → {0, 1}
Training examples D:
<cs, faculty, four> +
<cs, faculty, five> +
<ee, faculty, four> -
<cs, staff, five> -
4
Version Space(3) Hypotheses consistent with a training example
All combinations of the attributes of the training example, i.e. its power set
Training example: <cs, faculty, four>
Consistent hypotheses:
<cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <cs>, <faculty>, <four>
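The "power set" observation above is easy to check mechanically. A minimal Python sketch (names are illustrative; the deck itself works in DNA, not code) that enumerates every subset of the example's attribute values, including the empty hypothesis <>:

```python
from itertools import combinations

example = ("cs", "faculty", "four")

# Every subset of the example's attribute values is a consistent hypothesis
# (the power set), including the empty hypothesis <>.
hypotheses = [tuple(c) for r in range(len(example), -1, -1)
              for c in combinations(example, r)]

for h in hypotheses:
    print("<" + ", ".join(h) + ">")
```

This prints the 7 hypotheses listed above plus <>, 8 in total.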
5
Version Space(4)
[Hypothesis lattice diagram: single attributes cs, ee, faculty, staff, four, five; two-attribute conjunctions cs ∧ faculty, ee ∧ faculty, cs ∧ staff, ee ∧ staff, faculty ∧ four, faculty ∧ five; three-attribute conjunctions cs ∧ faculty ∧ four, cs ∧ faculty ∧ five, ee ∧ faculty ∧ four, cs ∧ staff ∧ five]
6
[Lattice diagram repeated]
+
Consistent hypotheses: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <cs>, <faculty>, <four>, <>
7
[Lattice diagram repeated]
+ –
Consistent hypotheses: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <faculty>, <four>
8
[Lattice diagram repeated]
+ + –
Consistent hypotheses: <cs, faculty>, <faculty>
9
[Lattice diagram repeated]
+ + – –
Consistent hypotheses: <cs, faculty>
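The elimination traced on the slides above reduces to plain set operations: intersect with the consistent set of each positive example, subtract the consistent set of each negative one. A minimal sketch, treating each hypothesis as a set of attribute values (the order in which examples are processed does not change the final result):

```python
from itertools import combinations

def consistent(values):
    """All hypotheses consistent with one example: the power set of its values."""
    return {frozenset(c) for r in range(len(values) + 1)
            for c in combinations(values, r)}

D = [(("cs", "faculty", "four"), "+"),
     (("cs", "faculty", "five"), "+"),
     (("ee", "faculty", "four"), "-"),
     (("cs", "staff", "five"), "-")]

first_values, _ = D[0]          # first example assumed positive
vs = consistent(first_values)
for values, label in D[1:]:
    if label == "+":
        vs &= consistent(values)   # keep hypotheses also consistent with the positive
    else:
        vs -= consistent(values)   # drop hypotheses that cover the negative

# The only surviving hypothesis is <cs, faculty>, matching the last slide.
print(vs)
```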
10
Version Space using DNA Computing
Problem Definition
Attributes: Dept {ee, cs}, Status {faculty, staff}, Floor {four, five}
Training examples D:
<cs, faculty, four> +
<cs, faculty, five> +
<ee, faculty, four> -
<cs, staff, five> -
11
Encoding.1 When an order among the attributes is considered
Each attribute value is represented by a basic DNA sequence, and sticky-end conditions are set so that these basic sequences can only be ligated to sequences belonging to different attributes.
In this case, hypotheses such as <cs, faculty> or <faculty, four> are generated, but <cs, four> is not.
Dept: ATGCA AATTG
Status: TACGT
Floor: TTAAC
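The ordering constraint can be illustrated with a toy model: each attribute value carries sticky ends that only pair Dept→Status and Status→Floor, so strands can only ligate in that fixed order. The end labels below are hypothetical placeholders, not the actual sequences from the slide:

```python
# Illustrative model of Encoding.1: sticky-end labels (hypothetical) enforce
# a fixed Dept -> Status -> Floor ligation order.
strands = {
    "cs":      ("DEPT_L",   "DEPT_R"),
    "ee":      ("DEPT_L",   "DEPT_R"),
    "faculty": ("STATUS_L", "STATUS_R"),
    "staff":   ("STATUS_L", "STATUS_R"),
    "four":    ("FLOOR_L",  "FLOOR_R"),
    "five":    ("FLOOR_L",  "FLOOR_R"),
}
# Sticky-end compatibility: the right end of one strand can ligate to the
# left end of the next only for these pairs.
compatible = {("DEPT_R", "STATUS_L"), ("STATUS_R", "FLOOR_L")}

def can_ligate(a, b):
    return (strands[a][1], strands[b][0]) in compatible

print(can_ligate("cs", "faculty"))   # True  -> <cs, faculty> forms
print(can_ligate("faculty", "four")) # True  -> <faculty, four> forms
print(can_ligate("cs", "four"))      # False -> <cs, four> cannot form
```

This reproduces exactly the limitation stated above: <cs, four> skips the Status attribute and so never assembles under this encoding.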
12
Encoding.2(1)
When no order among the attributes is considered
Use the encoding of Adleman's experiment:
Attribute value : vertex
Ligation of attribute values : edge
Complete graph, overhead
13
Encoding.2(2)
Graph for All Hypotheses
[Graph diagram on the vertices cs, ee, faculty, staff, four, five]
Vertex strands:
cs-H cs-T, ee-H ee-T, faculty-H faculty-T, staff-H staff-T, four-H four-T, five-H five-T
Edge strands:
cs-T’ faculty-H’, cs-T’ staff-H’, cs-T’ four-H’, cs-T’ five-H’,
ee-T’ faculty-H’, ee-T’ staff-H’, ee-T’ four-H’, ee-T’ five-H’,
faculty-T’ four-H’, faculty-T’ five-H’, staff-T’ four-H’, staff-T’ five-H’
+ Dummy strand for blunt end?
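The edge list above links every value to every value of a later attribute, so (unlike Encoding.1) hypotheses that skip an attribute, such as <cs, four>, also assemble. A sketch of which hypothesis strands this graph generates (a simulation of the combinatorics only, not of the chemistry):

```python
from itertools import product

# Encoding.2 sketch: attribute values are vertices; edge strands link the
# tail half (T) of an earlier attribute's value to the head half (H) of a
# later one, mirroring the edge list above.
attributes = [("cs", "ee"), ("faculty", "staff"), ("four", "five")]

# Edge strands between every pair of values from different attributes,
# oriented Dept -> Status -> Floor.
edges = set()
for i in range(len(attributes)):
    for j in range(i + 1, len(attributes)):
        for a, b in product(attributes[i], attributes[j]):
            edges.add((a, b))

def hypotheses():
    """Assemble every path that uses at most one value per attribute."""
    out = set()
    def extend(path, next_attr):
        if path:
            out.add(tuple(path))
        for k in range(next_attr, len(attributes)):
            for v in attributes[k]:
                if not path or (path[-1], v) in edges:
                    extend(path + [v], k + 1)
    extend([], 0)
    return out

hs = hypotheses()
print(len(hs))               # 26 non-empty hypothesis strands (3*3*3 - 1)
print(("cs", "four") in hs)  # True: Encoding.2 also yields <cs, four>
```

The count 26 is the "overhead" trade-off: a complete graph needs many edge strands, but it reaches every hypothesis.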
14
Encoding.2(3) Graph for hypotheses consistent with <cs, faculty, four>
[Graph diagram on the vertices cs, faculty, four]
Vertex strands: cs-H cs-T, faculty-H faculty-T, four-H four-T
Edge strands: cs-T’ faculty-H’, cs-T’ four-H’, faculty-T’ four-H’
+ Dummy strand for blunt end?
15
Encoding.2(4) The double strands produced by the previous process are not used directly; instead, only the single strands encoding the vertex parts are extracted and used.
The single strands can be extracted using strands complementary to the strand for each attribute value (beads?).
Why only single strands? For the intersection operation using hybridization.
An order was imposed on the attributes appearing in a hypothesis strand.
16
Encoding.3(1) Using beads
Requires far fewer sequences than the Adleman-style encoding above.
Also, all possible hypotheses can be generated at once, or all hypotheses consistent with a particular example can be generated.
[Bead with Dept, Status, Floor strands] + a dummy sequence for each attribute
17
Encoding.3(2) Problem!!
How can the hypotheses be amplified?
18
Detection(1) – Encoding.2
1. Tube1(0): all hypotheses consistent with the first training example (assumed to be positive)
2. For each subsequent example, do the following:
Tube2: all hypotheses consistent with the example
If positive: Tube1(n+1) = Tube1(n) ∩ Tube2
If negative: Tube1(n+1) = Tube1(n) - Tube2
19
Detection(2) – Encoding.2 Implementation of the ‘∩’ operation
The initial Tube1(0) is generated as proposed in Encoding.2 (single strands).
For each subsequent example, Tube2 is generated with an encoding complementary to Tube1(0) (single strands).
Tube2 is added to Tube1 and hybridized, and only the fully bound double strands are extracted.
The result is denatured, and only the single strands encoded in the same direction as the original Tube1(0) are extracted again.
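Read as set logic, the hybridization step keeps exactly the Tube1 strands whose perfect complement is present in Tube2. A sketch with illustrative sequences (concatenations of the Encoding.1 example strands, purely for demonstration):

```python
# '∩' via hybridization: a Tube1 strand survives iff its exact reverse
# complement is in Tube2. Sequences here are hypothetical placeholders.
COMP = str.maketrans("ACGT", "TGCA")

def complement(s):
    """Reverse complement of a strand."""
    return s.translate(COMP)[::-1]

tube1 = {"ATGCATACGT", "ATGCATTAAC", "TACGTTTAAC"}   # hypothesis strands
keep  = {"ATGCATACGT", "TACGTTTAAC"}                 # consistent with new example
tube2 = {complement(s) for s in keep}                # complementary encoding

# Full hybridization, extraction of double strands, then denaturing:
double_strands = {s for s in tube1 if complement(s) in tube2}
print(double_strands == tube1 & keep)  # True: exactly the set intersection
```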
20
Detection(3) – Encoding.2 Implementation of the ‘-’ operation
How? =_=; An ‘∩’ operation with the complement set? Another encoding for negative examples?
21
Detection(4) – Encoding.2 Problem
What if amplification of the strands is needed?
22
Detection – Encoding.3 For the bead-based encoding
How? =_=; ‘∩’, ‘-’, amplification
23
Application When an actual input arrives, how should it be classified?
Voting?
24
Reference on Version Space
Machine Learning, T. M. Mitchell, McGraw-Hill
Artificial Intelligence: Theory and Practice, Dean, Addison-Wesley