Longest Common Subsequence (LCS). Graduate student: 鍾聖彥. Advisor: 許慶昇. Dynamic Programming. 2014/05/07.

# Dynamic programming lcs

### Text of Dynamic programming lcs

Dynamic Programming

Optimal substructure: if a problem has an optimal solution, that solution contains optimal solutions to its subproblems.

Overlapping subproblems: the same subproblems recur many times.

Why Longest Common Subsequence?

Biological applications often need to compare the DNA of two (or more) different organisms.

Subsequence

A subsequence of a given sequence is just the given sequence with zero or more elements left out.

Ex: "app", "le", and "ple" are all subsequences of "apple".
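The definition can be checked mechanically. The following is a minimal sketch in Python; the helper name `is_subsequence` is hypothetical and does not come from the slides.

```python
def is_subsequence(candidate, sequence):
    """Return True if `candidate` is `sequence` with zero or more elements left out."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items can only match later positions (order is preserved).
    return all(item in remaining for item in candidate)


print(is_subsequence("app", "apple"))  # True
print(is_subsequence("ple", "apple"))  # True
print(is_subsequence("pla", "apple"))  # False: no "a" after the "l"
```

The empty sequence counts as a subsequence of every sequence, which matches the "zero or more elements left out" wording.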


Common Subsequence

Two sequences:

X = (A, B, C, B, D, A, B)

Y = (B, D, C, A, B, A)

Sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y.

Z = (B, C, A): length 3

Z = (B, C, A, B): length 4

Z = (B, D, A, B): length 4

Z = ? with length 5: none exists, so the common subsequences of length 4 are the longest.

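What is the Longest Common Subsequence problem?

Given two sequences $X = (x_1, x_2, \ldots, x_m)$ and $Y = (y_1, y_2, \ldots, y_n)$, find a maximum-length common subsequence of $X$ and $Y$.

How to do it? Brute force would enumerate every subsequence of $X$ and check whether it is also a subsequence of $Y$. Since $X$ has $2^m$ subsequences, that takes exponential time. Use dynamic programming instead.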
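For comparison, the brute-force approach can be sketched as follows. This is only an illustration of why it does not scale, not a method from the slides; the names `is_subsequence` and `lcs_brute_force` are hypothetical.

```python
from itertools import combinations


def is_subsequence(candidate, sequence):
    """Return True if `candidate` appears in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in candidate)


def lcs_brute_force(x, y):
    """Return one longest common subsequence of x and y as a list.

    Tries the subsequences of x from longest to shortest, so the first
    one that is also a subsequence of y is an LCS. Up to 2**len(x)
    candidates are examined.
    """
    for length in range(len(x), -1, -1):
        for indices in combinations(range(len(x)), length):
            candidate = [x[i] for i in indices]
            if is_subsequence(candidate, y):
                return candidate
    return []


result = lcs_brute_force("ABCBDAB", "BDCABA")
print(len(result))  # 4
```

This is fine for the seven-element example, but the number of candidates doubles with every extra element of x.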

Step 1: Characterize optimality

Given a sequence $X = (x_1, x_2, \ldots, x_m)$, define the $i$th prefix of $X$, for $i = 0, 1, \ldots, m$, as $X_i = (x_1, x_2, \ldots, x_i)$, with $X_0$ representing the empty sequence.

Ex: if X = (A, B, C, A, D, A, B), then $X_4$ = (A, B, C, A) and $X_0$ = ( ) is the empty sequence.

Theorem (Optimal substructure of LCS)

Let $X = (x_1, x_2, \ldots, x_m)$ and $Y = (y_1, y_2, \ldots, y_n)$ be sequences, and let $Z = (z_1, z_2, \ldots, z_k)$ be any LCS of $X$ and $Y$.

1. If $x_m = y_n$, then $z_k = x_m = y_n$ and $Z_{k-1}$ is an LCS of $X_{m-1}$ and $Y_{n-1}$.

2. If $x_m \neq y_n$, then $z_k \neq x_m$ implies that $Z$ is an LCS of $X_{m-1}$ and $Y$.

3. If $x_m \neq y_n$, then $z_k \neq y_n$ implies that $Z$ is an LCS of $X$ and $Y_{n-1}$.

Optimal substructure

An LCS of the original two sequences contains an LCS of prefixes of the two sequences. (If a problem has an optimal solution, that solution contains optimal solutions to its subproblems.)

Step 2: A recursive solution

Case 1: $X_i$ and $Y_j$ end with $x_i = y_j$.

$X_i$: $x_1\ x_2\ \ldots\ x_{i-1}\ x_i$

$Y_j$: $y_1\ y_2\ \ldots\ y_{j-1}\ y_j = x_i$

$Z_k$: $z_1\ z_2\ \ldots\ z_{k-1}\ z_k = y_j = x_i$

$Z_k$ is $Z_{k-1}$ followed by $z_k = x_i = y_j$, where $Z_{k-1}$ is an LCS of $X_{i-1}$ and $Y_{j-1}$.

LenLCS(i, j) = LenLCS(i-1, j-1) + 1

Cases 2 and 3: $X_i$ and $Y_j$ end with $x_i \neq y_j$.

If $z_k \neq y_j$, then $Z_k$ is an LCS of $X_i$ and $Y_{j-1}$.

If $z_k \neq x_i$, then $Z_k$ is an LCS of $X_{i-1}$ and $Y_j$.

LenLCS(i, j) = max{LenLCS(i, j-1), LenLCS(i-1, j)}

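Let $c[i, j]$ be the length of an LCS of $X_i$ and $Y_j$. The cases above give the recurrence

$$c[i, j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ c[i-1, j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j, \\ \max(c[i, j-1],\ c[i-1, j]) & \text{if } i, j > 0 \text{ and } x_i \neq y_j. \end{cases}$$

Case 1 reduces to the single subproblem of finding an LCS of $X_{m-1}$ and $Y_{n-1}$ and adding $x_m = y_n$ to the end of $Z$.

Cases 2 and 3 reduce to two subproblems, finding an LCS of $X_{m-1}$ and $Y$ and an LCS of $X$ and $Y_{n-1}$, and selecting the longer of the two.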
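The three cases translate almost line for line into a recursive function. The sketch below is one possible rendering in Python, with memoization via `functools.lru_cache` so that each (i, j) pair is solved once; the name `lcs_length_recursive` is hypothetical, and Python's 0-based indexing means $x_i$ is written `x[i - 1]`.

```python
from functools import lru_cache


def lcs_length_recursive(x, y):
    """Length of an LCS of x and y, computed top-down from the recurrence."""

    @lru_cache(maxsize=None)
    def c(i, j):
        # Base case: one of the prefixes is empty.
        if i == 0 or j == 0:
            return 0
        # Case 1: the prefixes end with the same element.
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        # Cases 2 and 3: drop the last element of one prefix, keep the longer result.
        return max(c(i, j - 1), c(i - 1, j))

    return c(len(x), len(y))


print(lcs_length_recursive("ABCBDAB", "BDCABA"))  # 4
```

For long inputs the recursion depth can reach len(x) + len(y), so a table-based version is usually the safer choice.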

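Step 3: Compute the length of the LCS

The LCS problem has only Θ(mn) distinct subproblems, one for each pair (i, j). A direct recursive implementation of the recurrence solves the same subproblems over and over and can take exponential time.

So use dynamic programming: compute each c[i, j] once and store it in a table.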
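To make the overlap visible, the sketch below runs the recurrence without any table and counts how many calls are made versus how many distinct (i, j) pairs are ever requested. It is an illustrative experiment only; `count_naive_calls` is a hypothetical name.

```python
def count_naive_calls(x, y):
    """Run the plain recursion and return (lcs_length, total_calls, distinct_subproblems)."""
    calls = 0
    seen = set()

    def c(i, j):
        nonlocal calls
        calls += 1
        seen.add((i, j))
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))

    length = c(len(x), len(y))
    return length, calls, len(seen)


# Two sequences with nothing in common force the two-way branch at every step.
length, calls, distinct = count_naive_calls("AAAAAA", "BBBBBB")
print(length, calls, distinct)
```

The number of distinct subproblems can never exceed (m + 1)(n + 1), while the call count grows much faster, which is exactly the situation where a table pays off.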


Procedure 1: LCS-Length takes two sequences $X = (x_1, x_2, \ldots, x_m)$ and $Y = (y_1, y_2, \ldots, y_n)$ as input.

Procedure 2: It stores the c[i, j] values in a table c[0..m, 0..n] and computes the entries in row-major order.

Procedure 3: It maintains a table b[1..m, 1..n] to help construct an optimal solution. b[i, j] points to the table entry corresponding to the optimal subproblem solution chosen when computing c[i, j].

Procedure 4: It returns the b and c tables; c[m, n] contains the length of an LCS of X and Y.

LCS-Length(X, Y)

    m = X.length
    n = Y.length
    let b[1..m, 1..n] and c[0..m, 0..n] be new tables
    for i = 1 to m
        c[i, 0] = 0
    for j = 0 to n
        c[0, j] = 0
    for i = 1 to m
        for j = 1 to n
            if x_i == y_j
                c[i, j] = c[i-1, j-1] + 1
                b[i, j] = "↖"
            else if c[i-1, j] ≥ c[i, j-1]
                c[i, j] = c[i-1, j]
                b[i, j] = "↑"
            else
                c[i, j] = c[i, j-1]
                b[i, j] = "←"
    return c and b

The table produced by LCS-Length on the sequences X = (A, B, C, B, D, A, B) and Y = (B, D, C, A, B, A).


The running time of the procedure is O(mn), since each table entry takes O(1) time to compute.
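A runnable counterpart of LCS-Length might look like the sketch below. It is an assumption-laden translation rather than the slides' own code: the strings "diag", "up" and "left" stand in for the arrow symbols, both tables are allocated with an extra row and column 0 for simplicity, and $x_i$ becomes `x[i - 1]` because Python indexes from 0.

```python
def lcs_length(x, y):
    """Bottom-up LCS table fill. Returns (c, b).

    c[i][j] is the LCS length of the prefixes x[:i] and y[:j];
    b[i][j] records which neighbouring entry c[i][j] was derived from.
    """
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    b = [[""] * (n + 1) for _ in range(m + 1)]  # row 0 and column 0 are unused

    for i in range(1, m + 1):          # row-major order
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"
    return c, b


c, b = lcs_length("ABCBDAB", "BDCABA")
print(c[7][6])  # 4
print(c[7])     # [0, 1, 2, 2, 3, 4, 4]
```

The two nested loops touch each of the m × n entries once, which is where the O(mn) bound comes from.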

Step 4: Construct an optimal LCS

PRINT-LCS(b, X, i, j)

    if i == 0 or j == 0
        return
    if b[i, j] == "↖"
        PRINT-LCS(b, X, i-1, j-1)
        print x_i
    else if b[i, j] == "↑"
        PRINT-LCS(b, X, i-1, j)
    else
        PRINT-LCS(b, X, i, j-1)

The initial call is PRINT-LCS(b, X, X.length, Y.length). For X = (A, B, C, B, D, A, B) and Y = (B, D, C, A, B, A), this procedure prints BCBA. The procedure takes O(m+n) time, since each recursive call decrements at least one of i and j.
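The traceback can also be written as a loop. The sketch below is a variant, not the PRINT-LCS procedure itself: it drops the b table and re-derives each direction from the c table, using the same tie-breaking rule (prefer moving up when the two neighbours are equal). The function name `lcs` is hypothetical.

```python
def lcs(x, y):
    """Return one LCS of x and y as a list, by filling c and walking back from c[m][n]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:          # diagonal step: this element is in the LCS
            result.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:  # move up
            i -= 1
        else:                             # move left
            j -= 1
    result.reverse()                      # elements were collected back to front
    return result


print("".join(lcs("ABCBDAB", "BDCABA")))  # BCBA
```

Each loop iteration decreases i, j, or both, so the walk takes at most m + n steps after the table is built.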

Example

X = <A, B, C, B, A>, Y = <B, D, C, A, B>

We will fill in the table in row-major order starting in the upper left corner using the following formula:

$$c[i, j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ c[i-1, j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j, \\ \max(c[i, j-1],\ c[i-1, j]) & \text{if } i, j > 0 \text{ and } x_i \neq y_j. \end{cases}$$

Thus the optimal LCS length is c[m, n] = c[5, 5] = 3. Tracing back from c[5, 5], we get Z = <B, C, B>.

Alternatively, starting at c[5, 4], we would produce Z = <B, C, A>.

*Note that the LCS is not unique, but the optimal length of the LCS is.
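The non-uniqueness can be checked by following both branches whenever the upper and left table entries tie. The sketch below assumes string inputs and collects every distinct LCS; `all_lcs` is a hypothetical name and this enumeration is an addition for illustration, not part of the slides' procedure.

```python
from functools import lru_cache


def all_lcs(x, y):
    """Return a sorted list of all distinct longest common subsequences of strings x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    @lru_cache(maxsize=None)
    def collect(i, j):
        if i == 0 or j == 0:
            return frozenset({""})
        if x[i - 1] == y[j - 1]:
            # Every LCS of these prefixes must end with the shared last element.
            return frozenset(s + x[i - 1] for s in collect(i - 1, j - 1))
        found = set()
        if c[i - 1][j] >= c[i][j - 1]:   # moving up keeps the optimal length
            found |= collect(i - 1, j)
        if c[i][j - 1] >= c[i - 1][j]:   # moving left keeps the optimal length
            found |= collect(i, j - 1)
        return frozenset(found)

    return sorted(collect(m, n))


print(all_lcs("ABCBA", "BDCAB"))  # ['BCA', 'BCB']
```

Both answers have length 3, in line with the note that the length is unique even when the subsequence is not.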


Reference

Lecture 13: Dynamic Programming - Longest Common Subsequence, http://faculty.ycp.edu/~dbabcock/cs360/lectures/lecture13.html

http://www.csie.ntnu.edu.tw/~u91029/LongestCommonSubsequence.html

Longest common subsequence (Cormen et al., Sec. 15.4)
