*Dynamic programming lcs*


Longest Common Subsequence (LCS): Dynamic Programming. Graduate student: 鍾聖彥. Advisor: 許慶昇. 2014/05/07


Dynamic Programming

Dynamic programming applies when a problem has two properties: optimal substructure and overlapping subproblems.

Why Longest Common Subsequence?

Biological applications often need to compare the DNA of two (or more) different organisms.

Subsequence

A subsequence of a given sequence is just the given sequence with zero or more elements left out.
Ex: app, le, ple and so on are subsequences of apple.

Common Subsequence

Two sequences: X = (A, B, C, B, D, A, B) and Y = (B, D, C, A, B, A).
Sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y.
Z = (B, C, A): length 3
Z = (B, C, A, B): length 4
Z = (B, D, A, B): length 4
Z = ?: length 5? No common subsequence of X and Y has length 5, so length 4 is the longest.

What is the Longest Common Subsequence problem?

Given X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn), find a maximum-length common subsequence of X and Y.

How to do it? Brute force checks every subsequence of X against Y, and X has 2^m subsequences. Dynamic programming does better.

Step 1: Characterize optimality

For a sequence X = (x1, x2, ..., xm), define the ith prefix of X, for i = 0, 1, ..., m, as Xi = (x1, x2, ..., xi), with X0 representing the empty sequence.
Ex: if X = (A, B, C, A, D, A, B), then X4 = (A, B, C, A) and X0 = ( ), the empty sequence.

Theorem (Optimal substructure of LCS)

Let X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn) be sequences, and let Z = (z1, z2, ..., zk) be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm-1 and Y.
3. If xm ≠ yn, then zk ≠ yn implies that Z is an LCS of X and Yn-1.

Optimal substructure: an LCS of the original two sequences contains an LCS of prefixes of the two sequences.

Step 2: A recursive solution

Case 1: Xi and Yj end with xi = yj.
Xi = (x1, x2, ..., xi-1, xi)
Yj = (y1, y2, ..., yj-1, yj = xi)
Zk = (z1, z2, ..., zk-1, zk = yj = xi)
Zk is Zk-1 followed by zk = xi = yj, where Zk-1 is an LCS of Xi-1 and Yj-1.
LenLCS(i, j) = LenLCS(i-1, j-1) + 1

Cases 2 and 3: Xi and Yj end with xi ≠ yj.
If zk ≠ yj, then Zk is an LCS of Xi and Yj-1.
If zk ≠ xi, then Zk is an LCS of Xi-1 and Yj.
LenLCS(i, j) = max{LenLCS(i, j-1), LenLCS(i-1, j)}

Let c[i, j] be the length of an LCS of Xi and Yj. The cases above give the recursion:
c[i, j] = 0                            if i = 0 or j = 0
c[i, j] = c[i-1, j-1] + 1              if i, j > 0 and xi = yj
c[i, j] = max(c[i, j-1], c[i-1, j])    if i, j > 0 and xi ≠ yj

Case 1 reduces to the single subproblem of finding an LCS of Xm-1 and Yn-1 and adding xm = yn to the end of Z.
Cases 2 and 3 reduce to two subproblems, finding an LCS of Xm-1 and Y and an LCS of X and Yn-1, and selecting the longer of the two.

Step 3: Compute the length of the LCS

The LCS problem has only Θ(mn) distinct subproblems. So? Use dynamic programming!
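The recursion from Step 2 can be transcribed almost literally into a top-down, memoized function. The Python below is an illustrative sketch, not part of the original slides: the name lcs_length_memo is hypothetical, and strings stand in for the sequences X and Y. Memoization ensures that each of the (m+1)(n+1) subproblems is computed at most once, which is the overlapping-subproblems property at work.

```python
from functools import lru_cache


def lcs_length_memo(x: str, y: str) -> int:
    """Length of an LCS of x and y, following the Step 2 recursion (illustrative name)."""

    @lru_cache(maxsize=None)
    def c(i: int, j: int) -> int:
        # c(i, j) is the LCS length of the prefixes x[:i] and y[:j].
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            # Case 1: the last elements match and extend an LCS of the shorter prefixes.
            return c(i - 1, j - 1) + 1
        # Cases 2 and 3: drop the last element of one prefix and keep the longer result.
        return max(c(i, j - 1), c(i - 1, j))

    return c(len(x), len(y))


print(lcs_length_memo("ABCBDAB", "BDCABA"))  # prints 4 for the deck's example sequences
```

The recursion depth can reach m + n, so for long sequences this form may exceed Python's default recursion limit; a bottom-up table does not have that problem.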
The procedure LCS-Length works as follows:
1. It takes two sequences X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn) as input.
2. It stores the c[i, j] values in a table c[0..m, 0..n], and it computes the entries in row-major order.
3. It maintains a table b[1..m, 1..n] to help construct an optimal solution. b[i, j] points to the table entry corresponding to the optimal subproblem solution chosen when computing c[i, j].
4. It returns the b and c tables; c[m, n] contains the length of an LCS of X and Y.
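The four points above can be sketched in Python. This is a minimal illustration rather than the deck's own code: the function name lcs_length and the direction markers "diag", "up" and "left" stored in b are assumptions chosen for readability, and b is given an unused row 0 and column 0 so that its indices line up with those of c.

```python
def lcs_length(x, y):
    """Fill the c (lengths) and b (directions) tables in row-major order."""
    m, n = len(x), len(y)
    # c[i][j] = LCS length of x[:i] and y[:j]; row 0 and column 0 stay 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    # b[i][j] records which subproblem c[i][j] was derived from.
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"
    return c, b


c, b = lcs_length("ABCBDAB", "BDCABA")
print(c[7][6])  # prints 4, the length of an LCS of the two example sequences
```

Each entry depends only on its upper, left and upper-left neighbours, which is why filling the table row by row always finds those neighbours already computed.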
LCS-Length(X, Y)
    m = X.length
    n = Y.length
    let b[1..m, 1..n] and c[0..m, 0..n] be new tables
    for i = 1 to m
        c[i, 0] = 0
    for j = 0 to n
        c[0, j] = 0
    for i = 1 to m
        for j = 1 to n
            if xi == yj
                c[i, j] = c[i-1, j-1] + 1
                b[i, j] = "↖"
            else if c[i-1, j] ≥ c[i, j-1]
                c[i, j] = c[i-1, j]
                b[i, j] = "↑"
            else
                c[i, j] = c[i, j-1]
                b[i, j] = "←"
    return c and b

Figure: the c and b tables produced by LCS-Length on the sequences X = (A, B, C, B, D, A, B) and Y = (B, D, C, A, B, A).

The running time of the procedure is O(mn), since each table entry takes O(1) time to compute.

Step 4: Construct an optimal LCS

PRINT-LCS(b, X, i, j)
    if i == 0 or j == 0
        return
    if b[i, j] == "↖"
        PRINT-LCS(b, X, i-1, j-1)
        print xi
    else if b[i, j] == "↑"
        PRINT-LCS(b, X, i-1, j)
    else
        PRINT-LCS(b, X, i, j-1)

The initial call is PRINT-LCS(b, X, X.length, Y.length). For the tables in the figure, this procedure prints BCBA. The procedure takes time O(m + n).

Example

X = [missing in source], Y = [missing in source]

We will fill in the table in row-major order, starting in the upper left corner, using the recurrence from Step 2.

Answer

Thus the optimal LCS length is c[m, n] = 3. Tracing back from c[5, 5] we get Z = [missing in source]. Alternatively, starting at c[5, 4] we would produce Z = [missing in source].

*Note that the LCS is not unique, but the optimal length of the LCS is.
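The closing note, that the LCS need not be unique while its length is, can be checked with a small traceback over the c table alone. The Python below is a sketch under stated assumptions: the helper names lcs_table and traceback and the prefer_up flag are hypothetical, and the input is the deck's earlier pair X = (A, B, C, B, D, A, B), Y = (B, D, C, A, B, A), because the sequences of the last example are missing. Breaking ties upward follows the choice made in LCS-Length; breaking them leftward can yield a different LCS of the same length.

```python
def lcs_table(x, y):
    """c[i][j] = length of an LCS of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def traceback(c, x, y, prefer_up=True):
    """Walk from c[m][n] back to row or column 0, collecting one LCS."""
    i, j = len(x), len(y)
    out = []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])  # a matched element belongs to the LCS
            i, j = i - 1, j - 1
        elif c[i - 1][j] > c[i][j - 1]:
            i -= 1
        elif c[i - 1][j] < c[i][j - 1]:
            j -= 1
        elif prefer_up:  # tie: either direction preserves the optimal length
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


x, y = "ABCBDAB", "BDCABA"
c = lcs_table(x, y)
first = traceback(c, x, y, prefer_up=True)
second = traceback(c, x, y, prefer_up=False)
print(first, second, c[len(x)][len(y)])  # prints: BCBA BDAB 4
```

Both results have length 4, and BDAB matches the common subsequence (B, D, A, B) listed near the start of the deck.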
Reference

Lecture 13: Dynamic Programming - Longest Common Subsequence
http://faculty.ycp.edu/~dbabcock/cs360/lectures/lecture13.html
http://www.csie.ntnu.edu.tw/~u91029/LongestCommonSubsequence.html
Longest common subsequence (Cormen et al., Sec. 15.4)
https://www.youtube.com/watch?v=Wv1y45iqsbk
https://www.youtube.com/watch?v=wJ-rP9hJXO0