A Brief Overview of RNA Bioinformatics

Sebastian Will, University of Vienna

Freiburg sRNA Meeting 2019


Same Goal: Functions and Role of RNAs


Different Methods


The Central Dogma (of RNA Bioinformatics)

Sequence ⇒ Structure ⇒ Function

Structure as Proxy of Function


RNA Structure is Emergent

[Figure: three RNA secondary structure drawings; base changes are labeled "inconsistent", "consistent", and "compensatory"]

Almost identical sequences: very different structures

Very different sequences: same structure


No Shortcut “Sequence ⇒ Function”


More Dogmas

The world is simple!∗

∗in first approximation

Viable Shortcut

Sequence ⇒ 2D Structure ⇒ Function

[Figure: two drawings of a tRNA structure with D-Loop, T-Loop, and Acceptor Stem labeled]


Energies of RNA structures can be calculated

Nearest Neighbor Model (NNM)

• Free energies = sum of loop energies

• Loop energies measured experimentally (based on UV melting curves)

• Loop energies depend on
    • loop type
    • size
    • base composition

⇒ large energy parameter tables

[Figure: example structure annotated with loop energies −3.4, −3.3, +3.5, +1.2, −2.4; total energy −4.4 kcal/mol]

+ distinguish RNA structures by free energy

+ define minimum free energy (MFE)

+ basis for entire tool set for RNA structure

- limitations (stay tuned)
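As a minimal sketch of the additive rule "free energy = sum of loop energies", the five loop energies from the example structure can simply be summed. The function name `total_free_energy` is hypothetical, and which loop each value belongs to is not recoverable from the slide text, so the values are treated as an unlabeled list.

```python
# Loop free energies (kcal/mol) as annotated on the example structure.
# Which loop each value belongs to is not recoverable from the slide text.
loop_energies = [-3.4, -3.3, +3.5, +1.2, -2.4]


def total_free_energy(energies):
    """Nearest Neighbor Model: the structure's free energy is the sum of its loop energies."""
    return sum(energies)


# Round to one decimal to hide floating-point noise.
total = round(total_free_energy(loop_energies), 1)
print(f"Total energy: {total} kcal/mol")
```

The sum reproduces the total of −4.4 kcal/mol stated on the slide.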


From NNM to Structure Prediction

GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA

[Figure: a large set of alternative secondary structures of this sequence, with free energies between −28.90 and −25.90; one structure is marked as the MFE]
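The step from an energy model to a predicted structure is typically done by dynamic programming over subsequences. The sketch below is not the NNM-based algorithm used by real folding tools: as a stated simplification it maximizes the number of base pairs (a Nussinov-style recursion) instead of minimizing loop-based free energy. The function name `nussinov` and the `min_loop` parameter are illustrative assumptions.

```python
def nussinov(seq, min_loop=3):
    """Return (number of pairs, dot-bracket string) of one maximum-pair nested structure.

    Simplified stand-in for energy minimization: every allowed pair scores 1.
    """
    can_pair = {"AU", "UA", "GC", "CG", "GU", "UG"}
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if seq[k] + seq[j] in can_pair:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one optimal structure from the filled table.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if seq[k] + seq[j] in can_pair:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


count, structure = nussinov("GGGAAAUCCC")
print(count, structure)
```

The same table-filling scheme carries over to energy minimization when the per-pair score is replaced by loop energies, at the cost of more case distinctions per loop type.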


Structure Prediction: Fast and Accurate

Performance of RNAfold (Vienna RNA Package 2.0)

[Figure: log-log plot of run time [s] versus sequence length (10^2 to 10^4); annotations: length 100 in 0.01 s, length 1000 in 1 s]

[adapted from Vienna RNA Package 2.0, ALMOB 2011]

• Very fast folding algorithms: 0.01 seconds at length 100

• Very useful accuracy: ∼ 70% predicted base pairs correct
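Accuracy figures of this kind are usually reported as sensitivity (fraction of reference base pairs that were predicted) and PPV (fraction of predicted base pairs that are in the reference). A minimal sketch of that computation on dot-bracket strings follows; the two toy structures and the function names are hypothetical and are not taken from the benchmark.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a nested dot-bracket string."""
    stack, pairs = [], set()
    for pos, char in enumerate(dot_bracket):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def sensitivity_and_ppv(predicted, reference):
    """Compare predicted and reference structures pair by pair."""
    pred, ref = base_pairs(predicted), base_pairs(reference)
    true_positives = len(pred & ref)
    sensitivity = true_positives / len(ref) if ref else 0.0
    ppv = true_positives / len(pred) if pred else 0.0
    return sensitivity, ppv


# Toy example (hypothetical structures): the prediction misses one reference pair.
reference = "((((...))))"
predicted = "(((.....)))"
sens, ppv = sensitivity_and_ppv(predicted, reference)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```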

[Figure: sensitivity (0 to 1) of RNAfold predictions per RNA family: 16S rRNA, 23S rRNA, 5S rRNA, 7SK RNA, Cili. Telo. RNA, Cis-reg. element, GII Intron, GI Intron, Hairp. Ribozyme, Ham. Ribozyme, IRES, Other Ribozyme, Other RNA, Other rRNA, RNAIII, RNase E 5′ UTR, RNase MRP RNA, RNase P RNA, snRNA, SRP RNA, Synthetic RNA, tmRNA, tRNA, Viral & Phage, Y RNA; adapted from Vienna RNA Package 2.0, ALMOB 2011]

[Figure: positive predictive value (PPV, 0 to 1) of RNAfold predictions per RNA family, from 16S rRNA to Y RNA; adapted from Vienna RNA Package 2.0, ALMOB 2011]


Limitations

• Modified bases , , . . .

• Non-canonical base pairs

[Figure: two secondary structure drawings of a short hairpin RNA]

• Pseudoknots

[Figure: drawing of a pseudoknotted RNA structure]
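Pseudoknots are a limitation because the standard folding recursions assume nested base pairs, whereas a pseudoknot contains crossing pairs. As a small illustration, a pair list can be checked for crossings as follows; the function name `has_pseudoknot` and the example pair lists are hypothetical.

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs (i, j) and (k, l) cross, i.e. i < k < j < l."""
    ordered = sorted(pairs)  # sort by 5' position so that k >= i in the inner loop
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:
                return True
    return False


nested = [(0, 8), (2, 5)]    # second pair lies inside the first
crossing = [(0, 5), (2, 8)]  # pairs interleave: pseudoknot
print(has_pseudoknot(nested), has_pseudoknot(crossing))
```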


RNAs Refold at 'Room Temperature'

[Figure: the MFE structure of the example sequence and several alternative structures, annotated with energy values +1.1, +2.5, +0.8, and +1.7]

The MFE misleads! Look at

• suboptimal structures

• structure ensembles

• kinetics (co-transcriptional!)


Suboptimals and Probabilities

GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA

Energies → Structure Probabilities

Structure Probabilities → Base Pair Probabilities
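The two arrows can be sketched with Boltzmann weights: each structure gets a probability proportional to exp(−E/RT), and the probability of a base pair is the summed probability of all structures containing it. The toy ensemble below (three structures with made-up energies) and all function names are assumptions for illustration. Real tools sum over the full structure ensemble with a partition function algorithm rather than a hand-picked list.

```python
import math

# RT in kcal/mol at 37 °C: gas constant in kcal/(mol*K) times temperature in K.
RT = 0.0019872 * 310.15


def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a nested dot-bracket string."""
    stack, pairs = [], set()
    for pos, char in enumerate(dot_bracket):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def boltzmann_probabilities(energies):
    """Energies (kcal/mol) -> structure probabilities within the given set."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / RT) for e in energies]
    partition = sum(weights)
    return [w / partition for w in weights]


def base_pair_probabilities(structures, probabilities):
    """Structure probabilities -> base pair probabilities."""
    result = {}
    for structure, prob in zip(structures, probabilities):
        for pair in base_pairs(structure):
            result[pair] = result.get(pair, 0.0) + prob
    return result


# Hypothetical toy ensemble; energies are made up for illustration.
structures = ["((...))", "(.....)", "......."]
energies = [-2.0, -1.0, 0.0]
probs = boltzmann_probabilities(energies)
bpp = base_pair_probabilities(structures, probs)
for pair, prob in sorted(bpp.items()):
    print(pair, round(prob, 3))
```

Even in this toy case the outer pair (0, 6) is more probable than the MFE structure itself, because it occurs in more than one structure.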


Suboptimals and Probabilities

GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA

[Figure: secondary-structure drawings of the suboptimal structures of this sequence; their free energies are listed below.]

-25.90 -25.90 -25.90 -26.70 -26.30 -27.00 -27.00 -27.00 -27.80 -26.10 -26.20 -26.20

-26.20 -26.20 -25.90 -25.90 -25.90 -25.90 -26.70 -26.10 -26.60 -26.60 -26.60 -26.00

-27.40 -25.90 -26.50 -26.40 -28.10 -26.40 -28.10 -26.40 -28.10 -26.40 -26.40 -28.90

-27.20 -27.30 -25.90 -26.50 -26.50 -26.20 -26.20 -26.20 -26.20 -27.00 -25.90 -26.00

-26.10 -26.10 -26.60 -26.10 -26.10 -26.60 -27.00 -27.00 -26.70 -26.70 -26.70 -26.70

-27.50 -26.10 -26.50 -26.40 -26.40 -26.10 -26.10 -26.10 -26.10 -26.90

Energies → Structure Probabilities

Structure Probabilities → Base Pair Probabilities
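The first step, from energies to structure probabilities, can be sketched with Boltzmann weights. This is a minimal illustration, not the slide's own code: it assumes the energies are in kcal/mol, a temperature of 37 °C, and it normalizes only over the structures listed (the full ensemble would need the complete partition function). The function name `boltzmann_probabilities` and the four sample energies, picked from the values above, are illustrative choices.

```python
import math

# Assumption: energies in kcal/mol, temperature 37 degrees C (310.15 K).
RT = 0.0019872 * 310.15  # gas constant in kcal/(mol*K) times temperature


def boltzmann_probabilities(energies, rt=RT):
    """Map free energies to probabilities within the given set of structures."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)  # partition function restricted to the listed structures
    return [w / z for w in weights]


# A few of the energies shown on the slide (illustrative subset).
energies = [-28.90, -28.10, -27.00, -25.90]
probs = boltzmann_probabilities(energies)
for e, p in zip(energies, probs):
    print(f"{e:7.2f}  {p:.3f}")
```

Lower energy means higher probability, and a difference of about 1.4 kcal/mol corresponds to roughly a tenfold change in probability at this temperature.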


Dotplots and Reliabilities

[Figure: base pair probability dot plot of the sequence GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA (sequence along both axes), with a secondary-structure drawing of the same sequence.]
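A dot plot displays one probability per base pair (i, j). As a minimal sketch of how such values arise from structure probabilities, the following sums the probability of every structure that contains a given pair. The toy ensemble and its probabilities are hypothetical, and the helper names `base_pairs` and `pair_probabilities` are not from the slides; real tools compute these values over the full ensemble by dynamic programming rather than by enumeration.

```python
from collections import defaultdict


def base_pairs(dot_bracket):
    """Return the set of (i, j) pairs (0-based, i < j) of a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def pair_probabilities(structures, probabilities):
    """Sum structure probabilities over all structures containing each pair."""
    p = defaultdict(float)
    for structure, prob in zip(structures, probabilities):
        for pair in base_pairs(structure):
            p[pair] += prob
    return dict(p)


# Hypothetical toy ensemble: three structures with their probabilities.
structures = ["((...))", "(.....)", "......."]
probabilities = [0.6, 0.3, 0.1]
bpp = pair_probabilities(structures, probabilities)
for (i, j), prob in sorted(bpp.items()):
    print(i, j, round(prob, 3))
```

Each entry of `bpp` corresponds to one dot of the plot; pairs that occur in many probable structures get large dots and can be read as reliable.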


Integrating Prior Knowledge

• Knowledge on base pairing:

[Figure: three alternative secondary-structure drawings of the same sequence, connected by arrows (← and →).]

• Structure probing experiments (e.g. SHAPE)

• Homology

Examp1 ----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--
Examp2 ----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC
Examp3 ---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC
Examp4 -------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----
Examp5 AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-
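For the structure probing bullet, one common way to feed SHAPE data into folding is to convert each reactivity into a pseudo free energy. The sketch below uses the Deigan et al. formula, ΔG = m · ln(reactivity + 1) + b with m = 2.6 and b = −0.8 kcal/mol; the slides do not say which method is meant, so this choice is an assumption. Treating negative values as missing data is likewise an assumed convention, and the reactivity list is made up for illustration.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Pseudo free energy (kcal/mol) for one nucleotide's SHAPE reactivity.

    Assumption: negative reactivities mark missing data and contribute nothing.
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


# Hypothetical reactivities: low, high, missing, moderate.
reactivities = [0.05, 1.8, -999.0, 0.4]
pseudo = [shape_pseudo_energy(r) for r in reactivities]
for r, e in zip(reactivities, pseudo):
    print(f"{r:8.2f}  {e:+.2f}")
```

Low reactivities yield a negative bonus that favors pairing at that position, while high reactivities yield a penalty.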


Comparative Analysis with Alifold

Examp1 ----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--
Examp2 ----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC
Examp3 ---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC
Examp4 -------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----
Examp5 AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-

⇓ Alifold

[Figure: consensus base pair probability dot plot of the alignment (consensus sequence along both axes) and a drawing of the consensus secondary structure.]

       ..........((((((...(((............))).......)))))).......
Examp1 ----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--
Examp2 ----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC
Examp3 ---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC
Examp4 -------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----
Examp5 AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-

[Fig. adapted from Vienna RNA Package 2.0, ALMOB 2011, alifold example]
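To see what the consensus structure means for the aligned sequences, one can check, for every consensus base pair, which canonical pair types the rows actually form. The sketch below does this for the alignment and structure shown above. It is not the Alifold scoring itself, only an illustration of the covariation signal it exploits; the helper names `base_pairs` and `pair_support` are hypothetical.

```python
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def base_pairs(dot_bracket):
    """Return the set of (i, j) pairs (0-based, i < j) of a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def pair_support(alignment, structure):
    """Per consensus pair: distinct canonical pair types seen, and the number
    of rows that do not form a canonical pair there (gaps included)."""
    report = {}
    for i, j in sorted(base_pairs(structure)):
        column = [row[i] + row[j] for row in alignment.values()]
        canonical = {p for p in column if p in CANONICAL}
        report[(i, j)] = (canonical, sum(p not in CANONICAL for p in column))
    return report


structure = "..........((((((...(((............))).......))))))......."
alignment = {
    "Examp1": "----------CCGG-AAA-CCGAACGCAGCACCGCGG------AU-CUGGAACGC--",
    "Examp2": "----------CGCU-AG--AACAAC-------UAUCU------GU-AGCGCGAAAAC",
    "Examp3": "---------AUUGUGUA--GCAUU------AGUUUGC-------GUGCAAAGAACGC",
    "Examp4": "-------UGCCAUCGCAUUAGCACC---U-AGCCGCAUUUUCUGGCGAUGAUG----",
    "Examp5": "AGCACCGAACCGCAU----GCGAACUGAG-AA--CGCAACC----AUGCGCGCACC-",
}

report = pair_support(alignment, structure)
for (i, j), (types, misses) in report.items():
    print(i + 1, j + 1, sorted(types), misses)
```

Columns where several different canonical pair types occur indicate compensatory mutations, which is the evidence that supports a conserved base pair.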


Simultaneous Alignment and Folding (with LocARNA)

[Figure: secondary-structure drawings and base pair probability dot plots of the input sequences, first for two sequences (AC021639.5_181586-181505 and U67517.1_7511-7582), then for all seven: AC021639.5_181586-181505, AP000063.1_59179-59095, AP000397.1_114390-114319, M10217.1_5910-5978, U67517.1_7511-7582, X03715.1_388-461 and X99256.1_11558-11626.]

                          (((((((..(((.............))).(((((.......)))))..............
AC021639.5_181586-181505  GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC 57
AP000063.1_59179-59095    GCGGGGGUGCCCGAGCCUGGCCAAAGGGGUCGGGCUCAGGACCCGAUGGCGUAGGCCUGC 60
AP000397.1_114390-114319  UGGAGUAUAGCCAAG--UGG--UAAGGCAUCGGUUUUUGGUACCG---------GCAUGC 47
X03715.1_388-461          CGGAAAGUAGCUUAGCUUGG--UAGAGCACUCGGUUUGGGACCGA---------GGGGUC 49
U67517.1_7511-7582        GCCGGGGUGGGGUAGUGGCCAUCCUGG---GGGACUGUGGAUCCC----------CUGAC 47
X99256.1_11558-11626      GUAAACAUAGUUUA------AUCAAAACAUUAGAUUGUGAAUCUAA----------CAAU 44
M10217.1_5910-5978        AGUAAAGUCAGCUA------AAAAAGCUUUUGGGCCCAUACCCCAA----------ACAU 44
                          .........10........20........30........40........50........6

                          (((((.......)))))))))))).
AC021639.5_181586-181505  GUAGGUUCGAAUCCUACCGGCUGCG 82
AP000063.1_59179-59095    GUGGGUUCAAAUCCCACCCCCCGCA 85
AP000397.1_114390-114319  AAAGGUUCGAAUCCUUUUACUCCAG 72
X03715.1_388-461          GCAGGUUCGAAUCCUGUCUUUCCGA 74
U67517.1_7511-7582        CCGGGUUCAAUUCCCGGUCCCGGCC 72
X99256.1_11558-11626      AGAGGCUCGAAACCUCUUGCUUACC 69
M10217.1_5910-5978        GUUGGUUAAACCCCUUCCUUUACUA 69
                          0........70........80....

GSRRRVR

URGSY

KA

gy-u

gga

u H AA

R G c

ru

YG

GRY

UB

D GRA

YCCRa

u--

c - u --gs

VD

RYR Y R G GU U

CR

AAU

CCYDYYBYYYSC

V

YR

cG

GS

RY

au

DY

YR

RN

AB

ioin

form

ati

cs·

S.

Wil

17

Simultaneous Alignment and Folding (with LocARNA)

[Figure: secondary structure drawings for the seven input sequences, and a drawing of their consensus structure]

Alignment with consensus structure:

                           (((((((..(((.............))).(((((.......)))))..............
AC021639.5_181586-181505   GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC 57
AP000063.1_59179-59095     GCGGGGGUGCCCGAGCCUGGCCAAAGGGGUCGGGCUCAGGACCCGAUGGCGUAGGCCUGC 60
AP000397.1_114390-114319   UGGAGUAUAGCCAAG--UGG--UAAGGCAUCGGUUUUUGGUACCG---------GCAUGC 47
X03715.1_388-461           CGGAAAGUAGCUUAGCUUGG--UAGAGCACUCGGUUUGGGACCGA---------GGGGUC 49
U67517.1_7511-7582         GCCGGGGUGGGGUAGUGGCCAUCCUGG---GGGACUGUGGAUCCC----------CUGAC 47
X99256.1_11558-11626       GUAAACAUAGUUUA------AUCAAAACAUUAGAUUGUGAAUCUAA----------CAAU 44
M10217.1_5910-5978         AGUAAAGUCAGCUA------AAAAAGCUUUUGGGCCCAUACCCCAA----------ACAU 44
                           .........10........20........30........40........50........6

                           (((((.......)))))))))))).
AC021639.5_181586-181505   GUAGGUUCGAAUCCUACCGGCUGCG 82
AP000063.1_59179-59095     GUGGGUUCAAAUCCCACCCCCCGCA 85
AP000397.1_114390-114319   AAAGGUUCGAAUCCUUUUACUCCAG 72
X03715.1_388-461           GCAGGUUCGAAUCCUGUCUUUCCGA 74
U67517.1_7511-7582         CCGGGUUCAAUUCCCGGUCCCGGCC 72
X99256.1_11558-11626       AGAGGCUCGAAACCUCUUGCUUACC 69
M10217.1_5910-5978         GUUGGUUAAACCCCUUCCUUUACUA 69
                           0........70........80....
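The consensus structure line of the alignment can be read programmatically. The following is a minimal sketch, not part of LocARNA: it parses the dot-bracket string from the slide into column pairs and counts how many of those pairs one aligned row supports with a canonical or G-U pair. The helper names parse_pairs and supported_pairs are hypothetical and introduced only for illustration.

```python
# Base pairs counted as supporting the consensus (Watson-Crick plus G-U wobble).
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}

# Consensus structure from the slide, both alignment blocks concatenated.
CONSENSUS = (
    "(((((((..(((.............))).(((((.......))))).............."
    "(((((.......))))))))))))."
)

# First aligned row from the slide (AC021639.5_181586-181505), both blocks.
ROW_AC021639 = (
    "GCAGUCGUGGCCGAGU---GGUUAAGGCGUCUGACUCGAAAUCAGAUUCCCUCUGGGAGC"
    "GUAGGUUCGAAUCCUACCGGCUGCG"
)


def parse_pairs(dot_bracket):
    """Return sorted (i, j) column pairs (0-based) of a dot-bracket string."""
    stack, pairs = [], []
    for col, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(col)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket at column %d" % col)
            pairs.append((stack.pop(), col))
    if stack:
        raise ValueError("unbalanced opening bracket(s)")
    return sorted(pairs)


def supported_pairs(aligned_row, pairs):
    """Count consensus pairs whose two columns form a canonical pair in this row.

    Gap characters never match an entry in CANONICAL, so gapped columns
    simply do not count as support.
    """
    return sum(1 for i, j in pairs if aligned_row[i] + aligned_row[j] in CANONICAL)


if __name__ == "__main__":
    pairs = parse_pairs(CONSENSUS)
    print("consensus base pairs:", len(pairs))
    print("supported by first row:", supported_pairs(ROW_AC021639, pairs))
```

Running the same count over all rows gives a rough picture of how well each sequence agrees with the consensus structure.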


Interaction Prediction

[Figure: cofold example structure]

• Similar to structure prediction: use NNM!

• Predict intra- and inter-molecular structure
  • strong restrictions (cofold), no KHP → fast
  • more freedom (Alkan et al.), KHP → slow

• IntaRNA: reasonable abstraction → fast
  • Use unpairing probabilities
  • E.g. genome-wide prediction of sRNA targets

[Cofold example figure adapted from Vienna RNA Package 2.0, ALMOB 2011]
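How unpairing probabilities can enter an interaction score is sketched below. This is an illustration in the spirit of accessibility-based approaches, not IntaRNA's implementation: the slide does not give the scoring formula, so the additive combination, the function names, and all numeric inputs are assumptions. The idea is that a site with unpaired probability Pu costs an opening energy of -RT ln(Pu), which is added to the hybridization energy.

```python
import math

# Gas constant in kcal/(mol*K) times temperature in K (37 degrees Celsius).
RT = 0.0019872 * 310.15


def opening_energy(p_unpaired):
    """Energy (kcal/mol) needed to make a site accessible: -RT * ln(Pu).

    p_unpaired is the probability that the whole interaction site is
    unpaired; it must lie in (0, 1].
    """
    if not 0.0 < p_unpaired <= 1.0:
        raise ValueError("p_unpaired must be in (0, 1]")
    # "+ 0.0" turns a possible -0.0 into 0.0 for Pu == 1.
    return -RT * math.log(p_unpaired) + 0.0


def interaction_energy(e_hybrid, pu_srna, pu_target):
    """Assumed additive score: hybridization energy plus both opening energies."""
    return e_hybrid + opening_energy(pu_srna) + opening_energy(pu_target)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    e_hybrid = -12.0  # kcal/mol, duplex energy of the two interaction sites
    print(interaction_energy(e_hybrid, pu_srna=0.5, pu_target=0.05))
```

With these hypothetical inputs, a poorly accessible target site (Pu = 0.05) offsets part of the favorable hybridization energy, which is why two sites with the same duplex energy can rank differently.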


RNA Bioinformatics—Take home

• Secondary structure as proxy for RNA function

• Nearest neighbor model (NNM) enables prediction of MFE structures and probabilities

• Solid foundation to construct methods for
  • Integrating prior knowledge
  • Simultaneous alignment and folding
  • Prediction of RNA interactions
  • . . . pseudoknots, modifications, non-canonical base pairs, 3D structure, kinetics, design

• Building blocks of pipelines to learn about RNA function, e.g. sRNA target prediction
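To give a feel for the dynamic programming behind structure prediction, here is a deliberately simplified sketch. It is not the nearest neighbor model: instead of minimizing NNM free energy it maximizes the number of base pairs (a Nussinov-style recursion), and the minimum hairpin loop length of 3 is an assumption. MFE prediction follows the same decomposition idea with a far more detailed energy function.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    best[i][j] holds the optimum for the subsequence seq[i..j]; a pair
    (k, j) is allowed only if at least min_loop bases lie between k and j.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            value = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the problem.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    value = max(value, left + 1 + best[k + 1][j - 1])
            best[i][j] = value
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC"))
```

The cubic running time of this recursion is the same order as that of standard MFE folding, which is part of why the richer problems on the list (interactions, simultaneous alignment and folding) need restrictions or abstractions to stay fast.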