Page 1

Development of SCWRL4 for improved prediction of protein side-chain conformations

Georgii Krivov, Maxim Shapovalov and Roland L. Dunbrack Jr.

In collaboration with Moscow Engineering & Physics Institute

© George Krivov

Page 2

SCWRL® program

[Diagram: given the backbone, SCWRL searches for the side-chain conformations that minimize the total energy (min E_total), which includes pair-wise interactions]

Version 3 was written by Adrian A. Canutescu and Dr. Roland L. Dunbrack Jr.

Main assumption: there is a finite set of possible conformations (called rotamers) for each amino acid residue.
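The total energy minimized over rotamer assignments is commonly written as a sum of per-residue and pair-wise terms. The form below is a sketch under that assumption; the symbols $E_{\text{self}}$, $E_{\text{pair}}$, $r_i$ and $N$ are illustrative notation and do not appear on the slide.

```latex
E_{\text{total}}(r_1,\dots,r_N)
  = \sum_{i=1}^{N} E_{\text{self}}(r_i)
  + \sum_{1 \le i < j \le N} E_{\text{pair}}(r_i, r_j),
\qquad
(r_1^{*},\dots,r_N^{*}) = \arg\min_{r_1,\dots,r_N} E_{\text{total}}(r_1,\dots,r_N)
```

Here $r_i$ ranges over the finite rotamer set of residue $i$, which is what makes the search combinatorial.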

Page 3

SCWRL's side-chain packing algorithm

1. Obtain the spatial backbone structure and amino acid sequence (PDB)

2. For each residue, build possible side-chain conformations (rotamers) using the rotamer library

3. Build the interaction graph

– each vertex denotes a certain residue

– an edge between vertices indicates that there is an interaction between some rotamers of the corresponding residues

4. Find the optimal assignment of side-chain conformations by graph decomposition and dynamic programming

5. Save the resolved structure into a file (PDB)

[Diagram: example interaction graph with vertices Res 1, Res 2, Res 3 and Res 4]
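A minimal sketch of steps 2 to 4 on toy data may help fix the idea. All energies, residue names and function names below are hypothetical, and the exhaustive search stands in for the graph decomposition and dynamic programming that SCWRL actually uses; it is only practical for very small inputs.

```python
from itertools import product

# Hypothetical toy data: one self energy per rotamer of each residue.
self_energy = {
    "Res1": [0.0, 1.0],
    "Res2": [0.5, 0.0],
    "Res3": [0.0, 0.2],
    "Res4": [0.3, 0.0],
}

# Hypothetical pairwise energies: (residue_a, residue_b) -> {(rot_a, rot_b): energy}.
# Rotamer pairs that are not listed do not interact (energy 0).
pair_energy = {
    ("Res1", "Res2"): {(0, 0): 2.0, (1, 1): 0.5},
    ("Res2", "Res3"): {(1, 1): 1.5},
    ("Res3", "Res4"): {(0, 1): 0.4},
}


def build_interaction_graph(pair_energy):
    """Step 3: an edge exists if some pair of rotamers of the two residues interacts."""
    return {pair for pair, table in pair_energy.items()
            if any(e != 0.0 for e in table.values())}


def total_energy(assignment):
    """Sum of self energies plus pairwise energies for one rotamer choice per residue."""
    energy = sum(self_energy[res][rot] for res, rot in assignment.items())
    for (a, b), table in pair_energy.items():
        energy += table.get((assignment[a], assignment[b]), 0.0)
    return energy


def best_assignment():
    """Step 4 by brute force: enumerate every combination and keep the lowest energy."""
    residues = sorted(self_energy)
    best = None
    for combo in product(*(range(len(self_energy[r])) for r in residues)):
        assignment = dict(zip(residues, combo))
        energy = total_energy(assignment)
        if best is None or energy < best[0]:
            best = (energy, assignment)
    return best


energy, assignment = best_assignment()
print(sorted(build_interaction_graph(pair_energy)), energy, assignment)
```

The number of combinations grows as the product of the rotamer counts, which is why the graph structure matters: only residues joined by an edge need to be enumerated together.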

Page 4

Improved dynamic programming based on a tree-decomposition of the interaction graph

A tree-decomposition of a graph $G = (V, E)$ is a pair $(T, X)$, where

– $T = (I, F)$ is a tree with a set of vertices $I$ and a set of edges $F$

– $X = \{X_i : i \in I\}$ is a family of subsets of the set $V$, associated with the vertices of $T$,

which satisfies the conditions:

1. $\bigcup_{i \in I} X_i = V$

2. $\forall (u, v) \in E \;\; \exists i \in I : u, v \in X_i$

3. $\forall v \in V$ the set of vertices $\{ i \in I : v \in X_i \}$ is connected in $T$

[Figure: an example graph on the vertices a, b, c, d, e, f, g, h, i, k, l and its tree-decomposition, next to the Res 1 to Res 4 interaction graph]

Hans L. Bodlaender, 1992
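The three conditions can be checked mechanically. The sketch below does so for a small hypothetical graph (a 4-cycle), not for the graph in the slide's figure; the function name and data layout are assumptions made for illustration, and `tree_edges` is taken on trust to form a tree.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions.

    bags: dict mapping a tree node i to its subset X_i of graph vertices.
    tree_edges: list of (i, j) pairs, assumed to form a tree on the bag keys.
    """
    # Condition 1: the union of all bags is V.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: every graph edge lies inside at least one bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: for every vertex, the bags containing it are connected in T.
    adjacency = {i: set() for i in bags}
    for i, j in tree_edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in adjacency[i] & nodes:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True


# Hypothetical example: the 4-cycle a-b-c-d-a.
cycle_vertices = ["a", "b", "c", "d"]
cycle_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

good_bags = {0: {"a", "b", "c"}, 1: {"a", "c", "d"}}
good_tree = [(0, 1)]

# Covers every vertex and edge, but the bags holding "a" (0 and 3) are not adjacent.
bad_bags = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"a", "d"}}
bad_tree = [(0, 1), (1, 2), (2, 3)]

print(is_tree_decomposition(cycle_vertices, cycle_edges, good_bags, good_tree))
print(is_tree_decomposition(cycle_vertices, cycle_edges, bad_bags, bad_tree))
```

Dynamic programming over such a decomposition enumerates rotamer combinations one bag at a time, so its cost is governed by the largest bag rather than by the whole graph.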

Page 5

The described combinatorial algorithm…

– resolves the global optimum (avoids stochastics): accuracy of prediction depends entirely on the rotamer library and energy potentials

– handles larger and denser graphs than the one based on biconnected components: SCWRL4 is capable of significantly larger proteins than SCWRL3

– typically finishes pretty quickly: no coffee breaks…

However… better accuracy is desired

– Involve more rotamers → more interactions to evaluate → more combinations to enumerate

– More realistic potentials → longer interaction range → more edges in the graph → less decomposable

The combinatorial complexity blows up: hardly feasible even with the new algorithm

– quick collision detection algorithm

– thermodynamic fluctuations via the Flexible Rotamers Model

Page 6

quick collision detection algorithm

Hierarchies of bounding boxes enable an efficient search for intersections between two groups of geometric figures

Given two groups of figures enclosed into k-dops…

1. If they overlap, then split each

2. Check each combination for overlapping

3. Disregard boxes that don't clash

4. Continue recursively on each clashing pair

James T. Klosowski et al., 1998
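The recursion can be sketched with the simplest bounding volume, an axis-aligned box (the cubic case), around groups of balls. The class and function names, the median split rule and the sample coordinates are assumptions made for this illustration, not details taken from SCWRL4.

```python
def box_of(spheres):
    """Axis-aligned box enclosing a list of (center, radius) spheres."""
    lo = tuple(min(c[a] - r for c, r in spheres) for a in range(3))
    hi = tuple(max(c[a] + r for c, r in spheres) for a in range(3))
    return lo, hi


def boxes_overlap(box1, box2):
    """Two boxes overlap only if their intervals overlap on every axis."""
    (lo1, hi1), (lo2, hi2) = box1, box2
    return all(lo1[a] <= hi2[a] and lo2[a] <= hi1[a] for a in range(3))


class Node:
    """Bounding-box hierarchy: each node encloses its spheres and splits them in two."""

    def __init__(self, spheres):
        self.spheres = spheres
        self.box = box_of(spheres)
        self.children = []
        if len(spheres) > 1:
            lo, hi = self.box
            axis = max(range(3), key=lambda a: hi[a] - lo[a])  # longest extent
            ordered = sorted(spheres, key=lambda s: s[0][axis])
            mid = len(ordered) // 2
            self.children = [Node(ordered[:mid]), Node(ordered[mid:])]


def clashing_pairs(a, b, out=None):
    """Collect pairs of sphere centers (one from each hierarchy) that intersect."""
    out = [] if out is None else out
    if not boxes_overlap(a.box, b.box):
        return out  # step 3: disregard boxes that don't clash
    if not a.children and not b.children:
        (c1, r1), (c2, r2) = a.spheres[0], b.spheres[0]
        if sum((x - y) ** 2 for x, y in zip(c1, c2)) <= (r1 + r2) ** 2:
            out.append((c1, c2))
        return out
    # Steps 1, 2 and 4: split whichever side can be split, recurse on each combination.
    for child_a in (a.children or [a]):
        for child_b in (b.children or [b]):
            clashing_pairs(child_a, child_b, out)
    return out


group_a = [((0.0, 0.0, 0.0), 1.0), ((10.0, 0.0, 0.0), 1.0)]
group_b = [((1.5, 0.0, 0.0), 1.0), ((20.0, 0.0, 0.0), 1.0), ((10.0, 5.0, 0.0), 1.0)]
pairs = clashing_pairs(Node(group_a), Node(group_b))
print(pairs)
```

Whole subtrees are discarded as soon as their boxes are found to be separated, so most non-clashing pairs are never tested individually.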

Page 7

quick collision detection algorithm … works best in conjunction with

k-Discrete Oriented Polytopes

– a class of convex polytopes with 2k planes; any plane is orthogonal to one of k basic axes, which remain fixed

– easy to enclose a ball

– easy to merge

– easy clash check

– almost rotatable

[Figure: examples for k = 2, k = 3 and k = 4, including cubic (k = 3) and tetrahedral (k = 4)]

Page 8

k-Discrete Oriented Polytopes: easy to enclose a ball

Simple projection onto all basic axes $x_i$, $i = 1..k$, gives the bounds $\min_i$ and $\max_i$ along each axis

[Figure: projection onto a basic axis $x_i$ with the bounds $\min_i$ and $\max_i$ marked]

Page 9

k-Discrete Oriented Polytopes: easy to merge

$\max_i = \max(a_i^{\max}, b_i^{\max}), \qquad \min_i = \min(a_i^{\min}, b_i^{\min})$

Page 10

k-Discrete Oriented Polytopes: easy clash check

k-DOP A, k-DOP B: do they clash?

A doesn't clash with B if there exists an axis $x_i$ ($1 \le i \le k$) such that

$a_i^{\min} > b_i^{\max}$ or $b_i^{\min} > a_i^{\max}$
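The three k-DOP operations (enclosing a ball, merging, and the clash check) reduce to interval arithmetic along the fixed axes. The sketch below assumes one particular choice of k = 4 tetrahedral axes and hypothetical function names; it shows the idea only and is not SCWRL4's implementation.

```python
import math

# Assumed choice of k = 4 fixed unit axes pointing to the corners of a tetrahedron.
S = 1.0 / math.sqrt(3.0)
AXES = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def enclose_ball(center, radius):
    """k-DOP of a ball: project the center onto each axis and extend by the radius."""
    return [(dot(center, axis) - radius, dot(center, axis) + radius) for axis in AXES]


def merge(a, b):
    """Smallest k-DOP containing both: min of the minima, max of the maxima per axis."""
    return [(min(a_min, b_min), max(a_max, b_max))
            for (a_min, a_max), (b_min, b_max) in zip(a, b)]


def no_clash(a, b):
    """True if some axis separates the two k-DOPs (so the enclosed figures cannot touch)."""
    return any(a_min > b_max or b_min > a_max
               for (a_min, a_max), (b_min, b_max) in zip(a, b))


ball_a = enclose_ball((0.0, 0.0, 0.0), 1.0)
ball_b = enclose_ball((5.0, 5.0, 5.0), 1.0)
ball_c = enclose_ball((0.5, 0.0, 0.0), 1.0)
merged = merge(ball_a, ball_b)
print(no_clash(ball_a, ball_b), no_clash(ball_a, ball_c), no_clash(merged, ball_b))
```

The test is conservative: when `no_clash` returns False the k-DOPs overlap, but the figures inside them may still be disjoint and need an exact check.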

Page 11

Fast anisotropic hydrogen bond potential

[Figure: hydrogen-bond geometry between O and H, with the vectors n, e0, e1 and e2]

[Equations not recoverable from the extracted text. Legible fragments: the O–H distance d and an optimal distance d_optimal; cosine terms involving n and the e vectors, limited by maximum angles; a weight w; an energy that blends E_Default with a hydrogen-bond term involving B, q_H and q_O; the values 2.1, 0.65 and 30, plus angle values that are garbled.]

Page 12

For a more relevant comparison, it makes sense to predict a crystal, not the asymmetric unit (ASU)

[Figure: number of side chains versus relative surface accessibility (%)]

Page 13

Knowledge of crystal symmetry enables higher accuracy…

[Figure: bar chart of the extra percent in average accuracy due to crystal awareness (axis from 0 to 14) for ALL, ARG, ASN, ASP, CYS, GLN, GLU, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR and VAL; series: all residues, crystal contacts]

Page 14

Tuning parameters of the Flexible Rotamer Model

– sample sub-rotamers $s_i$ around the rotamer library's conformation

– $E_{\text{static}}$ is due to the backbone and frame

$E_{\text{static}} = -T \log\left( \frac{1}{n} \sum_{i=1}^{n} e^{-E(s_i)/T} \right)$

– $E_{\text{pairwise}}$ is due to the side chains' interaction, with $p_{ij}$ the interaction of sub-rotamers $s_i^1$ and $s_j^2$

$E_{\text{pairwise}} = -T \log\left( \frac{1}{mn} \sum_{i=1}^{n} \sum_{j=1}^{m} e^{-p_{ij}(s_i^1, s_j^2)/T} \right)$

– $E_{\text{self}}$ combines $E_{\text{static}}$ with a log-probability term, weighted by $k$, where the probability comes from the rotamer library

(The signs in the two formulas are reconstructed from garbled slide text.)

The parameters, including $k$ and $T$, may be set up independently for each type of amino acid.

Search for optimal values in the high-dimensional space: optimize one amino acid type at a time (and loop over all).
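The averaging over sub-rotamers can be sketched as a Boltzmann-style average of the form -T log(mean(exp(-E/T))). This form is an assumption about the garbled formulas on the slide, and the function names and sample energies are hypothetical.

```python
import math


def boltzmann_average(energies, temperature):
    """Return -T * log(mean(exp(-E / T))) for a list of sub-rotamer energies.

    The minimum energy is factored out first so the exponentials cannot overflow.
    """
    e_min = min(energies)
    mean = sum(math.exp(-(e - e_min) / temperature) for e in energies) / len(energies)
    return e_min - temperature * math.log(mean)


def pairwise_average(pair_energies, temperature):
    """Same average over an n x m table of energies between two sets of sub-rotamers."""
    flat = [e for row in pair_energies for e in row]
    return boltzmann_average(flat, temperature)


samples = [0.0, 1.0, 5.0]          # hypothetical sub-rotamer energies
table = [[0.0, 2.0], [1.0, 50.0]]  # hypothetical 2 x 2 pairwise energies
print(boltzmann_average(samples, 1.0), pairwise_average(table, 1.0))
```

With a small T the average approaches the best sub-rotamer energy, and with a large T it approaches the plain mean, so T controls how forgiving the model is toward a single clashing sample.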

Page 15

Optimizing an expensive function in multidimensional space

$x \in \mathbb{R}^n$, $f(x)$ is expensive to evaluate; find $x^* = \arg\max_{\|x\| \le R} f(x)$

1. Generate a sample of arguments and evaluate the function at these points

2. Assume that a second-order approximation works well

3. From the linear regression, resolve the coefficients and their covariance

4. Maximization of a quadratic form is relatively simple, provided that we can resolve the eigenvalues and eigenvectors

5. Hence, generate a sample of quadratic forms, maximize each of them and aggregate

robust for non-convex functions!
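Steps 1 to 4 can be illustrated in one dimension, where the quadratic form has a single coefficient and no eigen-decomposition is needed. The test function, sample size and radius below are hypothetical, and step 5 (sampling quadratic forms from the coefficient covariance) is left out of this sketch.

```python
import random


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Steps 2-3: least-squares fit of y ~ c0 + c1*x + c2*x^2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(3)]
    return solve_linear(ata, aty)


def maximize_quadratic(coeffs, radius):
    """Step 4: maximize c0 + c1*x + c2*x^2 over |x| <= radius."""
    c0, c1, c2 = coeffs
    candidates = [-radius, radius]
    if c2 < 0:  # concave fit: the vertex is a candidate if it lies inside the ball
        vertex = -c1 / (2 * c2)
        if abs(vertex) <= radius:
            candidates.append(vertex)
    return max(candidates, key=lambda x: c0 + c1 * x + c2 * x * x)


def expensive_function(x):
    # Hypothetical stand-in for the costly objective.
    return 3.0 - (x - 1.0) ** 2


# Step 1: sample arguments and evaluate the function at these points.
rng = random.Random(0)
xs = [rng.uniform(-2.0, 2.0) for _ in range(20)]
ys = [expensive_function(x) for x in xs]
best_x = maximize_quadratic(fit_quadratic(xs, ys), radius=2.0)
print(best_x)
```

Checking the boundary points as well as the vertex is what keeps the maximization valid when the fitted quadratic is not concave.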

Page 16

Traces through the optimization of the FRM parameters

[Figure: average conditional accuracy (axis from 86.9 to 89.4) versus iteration number (0 to 80); series: Testing @ H-bonds 1, Testing @ H-bonds 2, Training @ H-bonds 1]

– Training on 40 proteins (~2,500 residues)

– + 24 more proteins and continue (~5,000 residues)

– Testing on 130 proteins (~20,000 residues)

Page 17

Measurement: side chains with better electron density are easier to predict

[Figure: conditional accuracy (%) versus confidence of side-chain placement (derived from experimental EDS maps), sliding frame of 20%; curves for ALL, ARG, ASN, ASP, CYS, GLN, GLU, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR and VAL; a legend with conditional combinations of the indices 1 to 4, whose symbols were lost in extraction]

Shapovalov et al., 2007

Page 18

All this, with improved usability (coming soon)

– SCWRL3.exe: backbone PDB file + rotamer library → output PDB file

– the functionality of SCWRL4 is available as a library (SCWRL4.DLL)

– this enables direct manipulation of the model via a C++ API: class SCWRL{ …};

Page 19

Acknowledgements

Dr. Roland Dunbrack

Prof. Nickolai Kudryashov

Colleagues:

Adrian Canutescu

Guoli Wang

Maxim Shapovalov

Qiang Wang

Qifang Xu

Questions, comments, suggestions?

Thanks for your attention! Have a nice day, and welcome to our poster!