Arc-Segment Alignment for RNA Secondary Structure

Page 1: Arc-Segment Alignment for RNA Secondary Structure

Arc-Segment Alignment for RNA Secondary Structure

Advisor: 楊昌彪 (Chang-Biao Yang)
Student: 彭永興 (Yung-Hsing Peng)

Page 2: Arc-Segment Alignment for RNA Secondary Structure

The Longest Common Subsequence (LCS) Problem

• A string: S1 = “TAGTCACG”
• A subsequence of S1: the result of deleting 0 or more symbols from S1 (the deleted symbols are not necessarily consecutive), e.g. G, AGC, TATC, AGACG
• Common subsequences of S1 = “TAGTCACG” and S2 = “AGACTGTC”: GG, AGC, AGACG
• Longest common subsequence (LCS):

S1: TAGTCACG
S2: AGACTGTC
LCS: AGACG
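The subsequence definition above can be checked mechanically. The following is a minimal sketch, not part of the original slides; the helper name `is_subsequence` is hypothetical.

```python
def is_subsequence(sub: str, s: str) -> bool:
    """Return True if `sub` can be obtained from `s` by deleting symbols."""
    it = iter(s)
    # `ch in it` advances the iterator until it finds ch, so order is respected.
    return all(ch in it for ch in sub)


S1 = "TAGTCACG"
S2 = "AGACTGTC"

# Every example from the slide is a subsequence of S1.
for example in ["G", "AGC", "TATC", "AGACG"]:
    assert is_subsequence(example, S1)

# A common subsequence must be a subsequence of both strings.
common = [c for c in ["GG", "AGC", "AGACG", "TATC"]
          if is_subsequence(c, S1) and is_subsequence(c, S2)]
print(common)
```

TATC is dropped from the list of common subsequences because it is a subsequence of S1 but not of S2.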

Page 3: Arc-Segment Alignment for RNA Secondary Structure

Sequence Alignment

S1 = TAGTCACG
S2 = AGACTGTC

Alignment 1:
----TAGTCACG
AGACT-GTC---

Alignment 2:
TAGTCAC-G--
-AG--ACTGTC

• Which one is better?
• We can set different gap penalties as parameters for different purposes.
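To make the role of gap penalties concrete, the sketch below scores the two alignments of S1 and S2 shown on this slide. The scoring scheme (a match reward, a mismatch penalty, and a per-run `gap_open` plus per-symbol `gap_extend`) and the function name `score_alignment` are assumptions chosen for illustration; the slides do not fix a particular scheme.

```python
def score_alignment(top: str, bottom: str, match: int = 1, mismatch: int = -1,
                    gap_open: int = 0, gap_extend: int = -1) -> int:
    """Score a given pairwise alignment ('-' marks a gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    score = 0
    prev_gap_row = None  # row that held a gap in the previous column, if any
    for a, b in zip(top, bottom):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            if row != prev_gap_row:
                score += gap_open      # charged once per run of gaps
            score += gap_extend        # charged for every gap symbol
            prev_gap_row = row
        else:
            score += match if a == b else mismatch
            prev_gap_row = None
    return score


first = ("----TAGTCACG", "AGACT-GTC---")
second = ("TAGTCAC-G--", "-AG--ACTGTC")

# Linear penalty: every gap symbol costs 1.
linear = [score_alignment(*aln, gap_open=0, gap_extend=-1) for aln in (first, second)]
# Penalty per gap run: opening a gap costs 2, extending it is free.
per_run = [score_alignment(*aln, gap_open=-2, gap_extend=0) for aln in (first, second)]
print(linear, per_run)
```

Under the linear penalty the second alignment scores higher (-1 versus -4). When the penalty is charged once per run of gaps, the first alignment scores higher (-2 versus -3), because it has three gap runs instead of four. Which alignment is "better" therefore depends on the parameters.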

Page 4: Arc-Segment Alignment for RNA Secondary Structure

After matrix A has been found, we can trace back to find the LCS. Entry A[i][j] is the LCS length of the first i symbols of S1 and the first j symbols of S2.

S1: TAGTCACG
S2: AGACTGTC
LCS: AGACG

      -  A  G  A  C  T  G  T  C
   -  0  0  0  0  0  0  0  0  0
   T  0  0  0  0  0  1  1  1  1
   A  0  1  1  1  1  1  1  1  1
   G  0  1  2  2  2  2  2  2  2
   T  0  1  2  2  2  3  3  3  3
   C  0  1  2  2  3  3  3  3  4
   A  0  1  2  3  3  3  3  3  4
   C  0  1  2  3  4  4  4  4  4
   G  0  1  2  3  4  4  5  5  5
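The matrix and the traceback can be reproduced with the standard LCS dynamic program. This is a sketch under the usual recurrence (diagonal plus one on a match, otherwise the larger of the upper and left neighbors); the function names `lcs_table` and `lcs_traceback` are hypothetical.

```python
def lcs_table(s1: str, s2: str) -> list[list[int]]:
    """A[i][j] = LCS length of s1[:i] and s2[:j]."""
    m, n = len(s1), len(s2)
    A = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                A[i][j] = A[i - 1][j - 1] + 1
            else:
                A[i][j] = max(A[i - 1][j], A[i][j - 1])
    return A


def lcs_traceback(A: list[list[int]], s1: str, s2: str) -> str:
    """Walk back from the bottom-right corner to recover one LCS."""
    i, j = len(s1), len(s2)
    out = []
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])   # matched symbol belongs to the LCS
            i -= 1
            j -= 1
        elif A[i - 1][j] >= A[i][j - 1]:
            i -= 1                  # the value came from the row above
        else:
            j -= 1                  # the value came from the column to the left
    return "".join(reversed(out))


S1, S2 = "TAGTCACG", "AGACTGTC"
A = lcs_table(S1, S2)
print(A[len(S1)][len(S2)], lcs_traceback(A, S1, S2))  # 5 AGACG
```

Filling the table takes O(mn) time for strings of lengths m and n, and the traceback takes O(m + n) steps. Ties in the traceback are broken toward the row above here; a different tie rule may return a different LCS of the same length.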

Page 5: Arc-Segment Alignment for RNA Secondary Structure

The Structure of RNA

Page 6: Arc-Segment Alignment for RNA Secondary Structure

Arc Annotation for RNA Secondary Structure

Page 7: Arc-Segment Alignment for RNA Secondary Structure

How to Compare Two RNA Secondary Structures

• Longest Arc-Preserving Common Subsequence (LAPCS)

O(n^5) for LAPCS(nested, nested)

LAPCS(crossing, crossing) is NP-hard

• Arc-Segment Alignment (Our Method)

O(n^2) for ASA(nested, nested)

ASA(crossing, crossing) may be solvable in polynomial time

Page 8: Arc-Segment Alignment for RNA Secondary Structure

Our Comparison Algorithm

(1) Given two RNA secondary structures S1 and S2 with lengths m and n, find the “sequence of arc segments” A1 from S1 and A2 from S2.

(2) Align A1 and A2 using the arc-segment alignment.

(3) The answer tells us how to handle the arc parts; from that, we know how to handle the remaining parts of the RNA sequences.
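Step (1) needs the arcs of each structure as input. The slides do not say how arcs are encoded or exactly how arcs are grouped into arc segments, so the sketch below only covers the first part under an assumption: the structure is given in dot-bracket notation, where matching parentheses mark the two endpoints of an arc and dots mark unpaired bases. The function name `parse_arcs` is hypothetical.

```python
def parse_arcs(dot_bracket: str) -> list[tuple[int, int]]:
    """Return the nested arcs of a dot-bracket string as (left, right) index pairs."""
    stack = []   # positions of '(' that are still waiting for their ')'
    arcs = []
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            arcs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(arcs)


print(parse_arcs("((..))..(.)"))  # [(0, 5), (1, 4), (8, 10)]
```

A single stack is enough only because the arcs are nested; crossing arcs cannot be written with one bracket type, so a crossing structure would need a different encoding.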

Page 9: Arc-Segment Alignment for RNA Secondary Structure

Arc-Segment Alignment

• ASA checks whether segments match, unlike the original LCS, which checks whether characters match. Therefore, we need a threshold to define what a “match” means.

• To check whether two segments match: arc size + arc location + sub-ASA (recursive)

• ASA performs simple sequence alignment if one of the RNA sequences does not contain any arcs.
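The idea of replacing character equality with a thresholded segment match can be sketched as an LCS-style dynamic program over segments. Everything specific here is an assumption made for illustration: a segment is represented by a single arc (left, right), the similarity formula and the weights `w_size` and `w_loc` are invented, and the recursive sub-ASA term is left out. It is not the authors' scoring function.

```python
Arc = tuple[int, int]


def segment_similarity(a: Arc, b: Arc, len_a: int, len_b: int,
                       w_size: float = 0.5, w_loc: float = 0.5) -> float:
    """Hypothetical similarity in [0, 1] built from arc size and arc location."""
    size_a, size_b = a[1] - a[0], b[1] - b[0]
    size_sim = min(size_a, size_b) / max(size_a, size_b)
    # Compare relative start positions so sequences of different length are comparable.
    loc_sim = 1.0 - abs(a[0] / len_a - b[0] / len_b)
    return w_size * size_sim + w_loc * loc_sim


def count_matched_segments(segs_a: list[Arc], segs_b: list[Arc],
                           len_a: int, len_b: int, threshold: float = 0.8) -> int:
    """LCS-style DP in which two segments match if their similarity reaches the threshold."""
    m, n = len(segs_a), len(segs_b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = max(table[i - 1][j], table[i][j - 1])
            sim = segment_similarity(segs_a[i - 1], segs_b[j - 1], len_a, len_b)
            if sim >= threshold:
                best = max(best, table[i - 1][j - 1] + 1)
            table[i][j] = best
    return table[m][n]


segs = [(0, 5), (8, 10)]
print(count_matched_segments(segs, segs, 11, 11))         # identical inputs: both match
print(count_matched_segments([(0, 5)], [(8, 10)], 11, 11))  # too dissimilar: no match
```

With k1 and k2 segments the table has (k1 + 1)(k2 + 1) entries, so this part stays quadratic. Raising or lowering `threshold` and changing the weights changes which segments count as matched.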

Page 10: Arc-Segment Alignment for RNA Secondary Structure

Example for ASA(nested, nested) part1

[Figure: example structures for ASA(nested, nested); surviving sequence labels: G, TGA, TA, A]

Page 11: Arc-Segment Alignment for RNA Secondary Structure

Example for ASA(nested, nested) part2

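[Figure: example continued; surviving sequence labels: A, AT, T; segments numbered 1, 2, 3 in each structure]

Perform the original sequence alignment for segments 1, 2, and 3.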

Page 12: Arc-Segment Alignment for RNA Secondary Structure

Advantage of ASA

• Time complexity is only O(n^2) for the nested-nested comparison

• It emphasizes the arcs, so it can reflect more structural similarity than LAPCS

• It may solve the crossing-crossing comparison in polynomial time if suitably modified

• It is flexible because we can set different thresholds and different weights for the score factors