Dynamics
Eran Eyal May 2011
Dynamics of proteins – what can we learn from experiments?
• X-ray crystallography
• NMR
TYROSINE-PROTEIN KINASE color by B-factor
B-factor
A measure of the uncertainty in the position of individual atoms
Anisotropic Displacement Parameters (ADP) from X-ray crystallography
Anisotropic Fluctuations ΔX ≠ ΔY ≠ ΔZ
Isotropic displacements ΔX = ΔY = ΔZ
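$\langle(\Delta X)^2\rangle = \langle(\Delta Y)^2\rangle = \langle(\Delta Z)^2\rangle = \langle(\Delta R)^2\rangle / 3$
Mean-square fluctuations of residues about their mean positions
$\langle(\Delta R)^2\rangle = \langle(\Delta X)^2\rangle + \langle(\Delta Y)^2\rangle + \langle(\Delta Z)^2\rangle$
[Figure axes: ΔX, ΔY, ΔZ]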
Anisotropic Displacement Parameters (ADP) - background
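 0.1524   0.0319   0.0291
 0.0319   0.2649  -0.0422
 0.0291  -0.0422   0.1672

                              X       Y        Z      Occ   B-factor
ATOM     16  CA  THR A   2    0.708  -5.416  -12.414  1.00   6.25
ANISOU   16  CA  THR A   2    16172   26490   15240   -4220   2910   3190
                              U11     U22     U33     U12     U13    U23

$$U = \begin{pmatrix}
\langle(\Delta Z)^2\rangle & \langle\Delta Z\,\Delta Y\rangle & \langle\Delta Z\,\Delta X\rangle \\
\langle\Delta Y\,\Delta Z\rangle & \langle(\Delta Y)^2\rangle & \langle\Delta Y\,\Delta X\rangle \\
\langle\Delta X\,\Delta Z\rangle & \langle\Delta X\,\Delta Y\rangle & \langle(\Delta X)^2\rangle
\end{pmatrix}$$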
Covariance matrix of each residue
[Figure axes: ΔX, ΔY, ΔZ]
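As an illustration of how an ADP record maps to a per-atom covariance matrix, the sketch below builds the symmetric 3×3 matrix from six ANISOU-style integers (stored in the PDB format as U × 10^4) and derives an equivalent isotropic B-factor and an anisotropy ratio. The six input integers are illustrative values chosen to reproduce the entries of the 3×3 matrix shown above; their assignment to U11..U23 is an assumption, and the function name `adp_matrix` is hypothetical.

```python
import numpy as np

def adp_matrix(u11, u22, u33, u12, u13, u23, scale=1e-4):
    """Build the symmetric 3x3 ADP matrix from six ANISOU integers.

    ANISOU fields are stored as U * 10^4, hence the default scale.
    """
    return scale * np.array([[u11, u12, u13],
                             [u12, u22, u23],
                             [u13, u23, u33]], dtype=float)

# Illustrative values (assumed assignment to U11, U22, U33, U12, U13, U23).
U = adp_matrix(1672, 2649, 1524, -422, 291, 319)

# <(dR)^2> is the trace of the covariance matrix.
msf = np.trace(U)

# Equivalent isotropic B-factor: B = (8/3) * pi^2 * <(dR)^2>.
b_equiv = 8.0 * np.pi ** 2 * msf / 3.0

# Variances along the principal axes (ascending order).
variances = np.linalg.eigvalsh(U)

# Anisotropy: variance at the short axis / variance at the long axis.
anisotropy = variances[0] / variances[-1]
```

An isotropic atom would give `anisotropy` equal to 1; smaller values indicate a more elongated displacement ellipsoid.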
Dynamics from NMR
http://ignm.ccbb.pitt.edu/oPCA_Online.htm
Dynamics of proteins – computational approaches
• Dynamics of proteins is clearly related to their function.
• Understanding the relation between the two is a major challenge in the field of biophysics
• Molecular Dynamics provides a way to conduct non-equilibrium simulations, but only for short time scales (about 10^-7 s)
• Normal Mode Analysis provides a way to analyze equilibrium motion for longer time scales
Time and amplitude scales

Global motions: ms - h (10^-3 - 10^4 s), more than 10 Å
• Type of motion: helix-coil transition, folding/unfolding, subunit association
• Functionality examples: hormone activation, protein functionality

Large scale motions: μs - ms (10^-6 - 10^-3 s), 5 - 10 Å
• Type of motion: domain motion, subunit motion
• Functionality examples: hinge bending motion, allosteric transitions

Medium scale motions: ns - μs (10^-9 - 10^-6 s), 1 - 5 Å
• Type of motion: loop motion, terminal-arm motion, rigid-body motion (helices)
• Functionality examples: active site conformation adaptation, binding specificity

Local motions: fs - ps (10^-15 - 10^-12 s), less than 1 Å
• Type of motion: atomic fluctuation, side chain motion
• Functionality examples: ligand docking flexibility, temporal diffusion pathways
Modified after: Becker & Watanabe (2001). Dynamic Methods. In Computational Biochemistry and Biophysics (Edited by Becker et al.)
Source: http://www.lofar.org/BlueGene/Suits.pdf
Molecular Dynamics (MD)
• MD simulates the motion of all particles in a molecular system by iteratively solving Newton’s equations of motion.
• Based on Newton’s classical mechanics: F = ma
• Evaluation of energies and forces of interaction of all particles in the system.
• MD provides links between structure and dynamics by enabling the exploration of the conformational energy landscape accessible to macromolecules.
Newton’s equation of motion
Newton’s equation of motion is given by:
$$F_i = m_i a_i$$
where $F_i$ is the force exerted on particle i, $m_i$ is the mass of particle i and $a_i$ is the acceleration of particle i. The force can also be expressed as the gradient of the potential energy:
$$F_i = -\nabla_i V$$
Combining these two equations yields:
$$-\frac{dV}{dr_i} = m_i \frac{d^2 r_i}{dt^2}$$
where V is the potential energy of the system. Newton’s equation of motion can then relate the derivative of the potential energy to the changes in position as a function of time.
MD: historical perspective
• 1977: 1st protein MD (McCammon, Gelin, Karplus). 9.2 ps, vacuum.
• Today’s ‘regular’ MD: 10-100 ns, explicit solvent.
• IBM’s Blue Gene aims at longer and more accurate MD runs.
Quality of MD runs depends on:
• Time-scale and resolution of investigated motion.
• Availability of good force-field parameterization (partial charges) for required data, e.g. Residue Topology File (RTF) for cofactors (transferability vs. refined accuracy).
• Quality of initial data.
• Details taken into account (solvent, long-range electrostatics, constraints on the system etc.)
1. Hydrogens: A. freeze the fastest modes of vibration by constraining the bonds to hydrogen atoms to a fixed length (SHAKE or RATTLE algorithms). B. treat the water solvent (~half of all simulated atoms) as a rigid body (if using explicit solvent). C. consider simulating without non-polar hydrogens by making larger ‘pseudo-atoms’ (‘united atoms’ force-field).
2. Integrators: solving Newton’s equation requires a numerical procedure for integrating the 2nd order differential equation. A standard procedure is the finite-difference approach. The coordinates and velocities at time t+Δt are obtained (to a sufficient degree of accuracy) from the molecular coordinates and velocities at an earlier time t. The equation is solved for each time-step.
3. Long-range electrostatics: the slowly decaying Coulombic interactions (r^-1), dipolar interactions (r^-3) and even London dispersion forces (r^-6) must be solved over long distances. Cut-offs are problematic as we must conserve energy and not introduce discontinuities in the system. Solutions include: Switch (problematic), PME (Particle Mesh Ewald), P3M (Particle-Particle Particle-Mesh), Reaction Field (RF: treats nearby solvent explicitly and the rest implicitly), Generalized RF. See description in slides below.
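The finite-difference idea in item 2 can be sketched with the velocity Verlet scheme, one common integrator; the slides do not say which integrator is meant, so this choice is an assumption. The toy system is a 1D harmonic oscillator, and the time step is set to 1/20 of its period as an illustrative value. The function name `velocity_verlet` and all parameters are hypothetical.

```python
import numpy as np

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle with the velocity Verlet finite-difference scheme."""
    f = force(x)
    positions = [x]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        positions.append(x)
    return np.array(positions), v

# Toy system: harmonic oscillator with F = -k x (arbitrary units).
k, m = 1.0, 1.0
period = 2.0 * np.pi * np.sqrt(m / k)
dt = period / 20.0                         # assumed: 1/20 of the period

traj, v_end = velocity_verlet(1.0, 0.0, lambda x: -k * x, m, dt, 2000)

e_start = 0.5 * k * 1.0 ** 2
e_end = 0.5 * m * v_end ** 2 + 0.5 * k * traj[-1] ** 2
```

With this step size the total energy is not exactly constant but stays bounded over the run, which is the property one typically wants from an MD integrator.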
MD Bottlenecks Require Designated Solutions
• Integrator: choose Δt to be ~1/20 of the fastest measured motion.
• Hydrogen vibration is the fastest movement (~10 fs). Compute with a designated algorithm: use a 1 fs (10^-15 s) time step for the system.
• 1,000,000 steps / ns. 30k+ atoms in a typical run. ~1,000 machine instructions each step.
• E(non-bonded) component complexity: check use of united atoms, implicit solvent, harmonic constraints for parts of the system.
• Complementary: if the question includes non-Newtonian forces, consider QM/MM or designated methods.
Molecular Mechanics (MM): more considerations
MD steps
• Build system (solvent, cofactors, unique forces)
• Run system (duration, temperature, only minimization or ‘real’ dynamics)
• Analyze system
Minimization
• The energy function surface (called energy landscape) of a macromolecule is composed of multiple local minima (stable states) and multiple saddle points (transition states).
• Minimization is a procedure to find a local minimum.
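To make the point about local minima concrete, here is a minimal steepest-descent sketch on a one-dimensional double-well energy V(x) = (x^2 - 1)^2. The energy function, step size and starting points are illustrative assumptions, and real MD packages use more elaborate minimizers; the sketch only shows that the minimum reached depends on where the search starts.

```python
def steepest_descent(grad, x0, step=0.01, tol=1e-8, max_iter=100000):
    """Follow the negative gradient until it (nearly) vanishes."""
    x = float(x0)
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

def double_well_gradient(x):
    """dV/dx for the toy energy V(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)

# Two starting points on either side of the barrier at x = 0.
left_minimum = steepest_descent(double_well_gradient, -0.3)
right_minimum = steepest_descent(double_well_gradient, 0.4)
```

Starting at -0.3 the search ends near x = -1, starting at 0.4 it ends near x = +1: the procedure finds a local minimum, not necessarily the global one.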
Source (old link): http://www.tau.ac.il/~becker/course.html
Source: http://www.ch.embnet.org/MD_tutorial/
MD procedure: cool, heat, cool, produce
MD Analysis
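I) Mean energy
$$\langle E \rangle = \frac{1}{N}\sum_{i=1}^{N} E_i$$
II) RMS difference between two structures
$$RMS = \sqrt{\frac{1}{N}\sum_{i}\left(r_i^{\alpha} - r_i^{\beta}\right)^2}$$
III) RMS fluctuations
$$RMS_i^{fluct} = \sqrt{\frac{1}{N_f}\sum_{f}\left(r_i^{f} - r_i^{average}\right)^2}$$
Note the relation between the RMS fluctuations and the crystallographic B factors:
$$B_i = \frac{8}{3}\pi^2 \left(RMS_i^{fluct}\right)^2$$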
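The analysis quantities above can be sketched for a trajectory stored as a NumPy array of shape (frames, atoms, 3). The array layout, the function names and the tiny synthetic trajectory are assumptions made for illustration; no structural superposition is performed here, which a real analysis would usually need first.

```python
import numpy as np

def rms_fluctuations(traj):
    """Per-atom RMS fluctuation about the mean position.

    traj: array of shape (n_frames, n_atoms, 3).
    """
    mean_pos = traj.mean(axis=0)
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)   # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))

def b_factors(rmsf):
    """B_i = (8/3) * pi^2 * (RMS fluctuation of atom i)^2."""
    return (8.0 / 3.0) * np.pi ** 2 * rmsf ** 2

def rms_difference(coords_a, coords_b):
    """RMS difference between two structures with matching atom order."""
    return np.sqrt(((coords_a - coords_b) ** 2).sum(axis=1).mean())

# Synthetic trajectory: atom 0 is fixed, atom 1 oscillates +/-0.5 along x.
traj = np.zeros((4, 2, 3))
traj[0::2, 1, 0] = 0.5
traj[1::2, 1, 0] = -0.5

rmsf = rms_fluctuations(traj)
bfac = b_factors(rmsf)
rms_01 = rms_difference(traj[0], traj[1])
```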
MD tricks
• How far to compute?
• How much solvent do we need?
• Until the solvent behaves as bulk water…
• Use periodic boundary conditions: translation of the system to infinity by rigid translation of all atoms. Each particle interacts with all particles (not itself) in the box and in the images, but we store only the main box. Once a particle leaves the cell, it is replaced by a particle coming from the opposite image.
• For cutoff distance R (VdW 7-10 Å, Coulomb 10-15 Å), use box > 2R
• What will happen if we replace Zn++ by Ca++?
• We can use computational alchemy: in each step change the weights between the system’s interaction with each metal, i.e. in the beginning use 95% Zn++ and 5% Ca++ that are computed in parallel. The Zn doesn’t see the Ca and vice versa. Finish by reversing the weights until 100% Ca++.
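The periodic boundary conditions described in the list above are usually applied through the minimum-image convention: each pair distance is measured to the nearest periodic copy, and particles leaving the box are wrapped back in. The sketch below assumes an orthorhombic box given by its three edge lengths; the function names and the 30 Å box are hypothetical.

```python
import numpy as np

def minimum_image_distance(r_i, r_j, box):
    """Distance between two particles using the nearest periodic image."""
    d = np.asarray(r_j, dtype=float) - np.asarray(r_i, dtype=float)
    d -= box * np.round(d / box)   # shift each component into [-box/2, box/2]
    return np.linalg.norm(d)

def wrap_into_box(r, box):
    """Map a position back into the main box [0, box)."""
    return np.mod(np.asarray(r, dtype=float), box)

box = np.array([30.0, 30.0, 30.0])   # assumed box edges in Angstrom

# Two particles near opposite faces are close through the boundary.
dist = minimum_image_distance([1.0, 1.0, 1.0], [29.0, 1.0, 1.0], box)

# A particle that left the box re-enters from the opposite side.
wrapped = wrap_into_box([31.0, -1.0, 15.0], box)
```

The minimum-image convention is only unambiguous when the cutoff is at most half the box edge, which is the reason for the "box > 2R" rule of thumb.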
Normal Mode Analysis
• A simple analytical tool to explore equilibrium dynamics of proteins
• Approximation of the potential at a local minimum by a harmonic function
• Introduced in the early 1980s, shortly after the introduction of MD:
Go, Noguti and Nishikawa (1983) PNAS 80, 3693-3700
Brooks and Karplus (1983) PNAS 80, 6571-6575
Levitt, Sander and Stern (1985) JMB, 181, 423-427
Stern (1989) Prog Clin Biol Res 289, 87-94.
Elastic network models are a special type of normal mode analysis. The molecule is represented as a network of nodes connected by springs.
Elastic Network Models
Representing the protein structure as an elastic network produces, using a single parameter, an accurate and detailed description of the dynamics of the system
Tirion (1996), Phys Rev Lett, 77, 1905-1908
The most global dynamic features of the system are maintained even when the system is modeled at a more simplified (“coarse-grained”) level
Doruker, Atilgan and Bahar (2000), Proteins, 40, 512-524
Tama & Sanejouand (2001), Protein Eng, 14, 1-6
Atilgan et al (2001), Biophys J. 80, 505-515
Contact criterion: d < rc
GNM (Gaussian Network Model) and ANM (Anisotropic Network Model) are residue-level models widely used for investigating the dynamics of biological systems
Bahar, Atilgan and Erman (1997), Fold Des, 2, 173-181
Hinsen (1998), Proteins, 33, 417-429
Doruker, Atilgan and Bahar (2000), Proteins, 40, 512-524
Gaussian Network Model (GNM)
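Potential (the sum runs over pairs in contact, d_ij < r_c):
$$V = \frac{\gamma}{2}\sum_{i<j,\; d_{ij}<r_c}\left|R_{ij} - R_{ij}^{0}\right|^{2} = \frac{\gamma}{2}\,\Delta R^{T}\,\Gamma\,\Delta R$$
Kirchhoff (N×N), example for three nodes in a chain:
$$\Gamma = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}$$
Covariance (N×N):
$$C \approx \frac{k_B T}{\gamma}\,\Gamma^{-1} = \frac{k_B T}{\gamma}\sum_{i=1}^{N-1}\lambda_i^{-1}\,u_i u_i^{T}$$
$\lambda_i$ are the eigenvalues of Γ
$u_i$ are the eigenvectors of Γ
Relation between probability and potential (Boltzmann):
$$p(\Delta R) \propto \exp\left\{-\frac{\gamma}{2 k_B T}\,\Delta R^{T}\,\Gamma\,\Delta R\right\} \propto \exp\left\{-\frac{1}{2}\,\Delta R^{T}\left(\frac{k_B T}{\gamma}\,\Gamma^{-1}\right)^{-1}\Delta R\right\}$$
General form of multivariate probability distribution:
$$W(x, \mu, \Xi) = \frac{1}{(2\pi)^{N/2}\,|\Xi|^{1/2}}\exp\left\{-\frac{1}{2}\,(x-\mu)^{T}\,\Xi^{-1}\,(x-\mu)\right\}$$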
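A minimal GNM sketch, assuming node coordinates in a NumPy array and a distance cutoff r_c: it builds the Kirchhoff matrix and obtains the covariance from the non-zero eigenvalues and their eigenvectors, i.e. a pseudo-inverse scaled by k_B T / γ. The function names, the three-node toy chain and the value 1.0 used for k_B T / γ are illustrative assumptions.

```python
import numpy as np

def kirchhoff(coords, r_c):
    """Kirchhoff matrix: -1 for pairs closer than r_c, contact count on the diagonal."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    contact = (dist < r_c) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

def gnm_covariance(gamma, kT_over_gamma=1.0, tol=1e-8):
    """Covariance from the non-zero modes: (kT/gamma) * sum_i u_i u_i^T / lambda_i."""
    lam, u = np.linalg.eigh(gamma)
    keep = lam > tol                      # drop the zero mode (rigid translation)
    return kT_over_gamma * (u[:, keep] / lam[keep]) @ u[:, keep].T

# Toy chain: three nodes 3.8 A apart, cutoff 5 A, so only neighbours are in contact.
coords = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
gamma = kirchhoff(coords, r_c=5.0)
cov = gnm_covariance(gamma)
msf = np.diag(cov)                        # relative mean-square fluctuations
```

For this chain the Kirchhoff matrix is the 3×3 example given above, and the middle node, which has two contacts, fluctuates less than the two ends.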
Anisotropic Network Model (ANM)
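Contact criterion: d < rc
Potential (the sum runs over pairs in contact, d_ij < r_c):
$$V = \frac{\gamma}{2}\sum_{i<j,\; d_{ij}<r_c}\left(|R_{ij}| - |R_{ij}^{0}|\right)^{2}$$
Hessian (3N×3N), off-diagonal 3×3 super-elements, where $X_{ij}$, $Y_{ij}$, $Z_{ij}$ are the components of $R_{ij}^{0}$:
$$H_{ij} = \frac{\gamma\,\Gamma_{ij}}{\left(R_{ij}^{0}\right)^{2}}\begin{pmatrix}
X_{ij}X_{ij} & X_{ij}Y_{ij} & X_{ij}Z_{ij} \\
Y_{ij}X_{ij} & Y_{ij}Y_{ij} & Y_{ij}Z_{ij} \\
Z_{ij}X_{ij} & Z_{ij}Y_{ij} & Z_{ij}Z_{ij}
\end{pmatrix}$$
Covariance (3N×3N):
$$C \approx H^{-1} = \sum_{i=1}^{3N-6}\lambda_i^{-1}\,u_i u_i^{T}$$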
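A minimal ANM sketch under the same assumptions as the formulas above (one node per residue, uniform spring constant γ, distance cutoff r_c): it assembles the 3N×3N Hessian from 3×3 super-elements and inverts it over the 3N-6 non-zero modes. The function names and the random toy coordinates are hypothetical; a real application would use Cα coordinates read from a PDB file.

```python
import numpy as np

def anm_hessian(coords, r_c, gamma=1.0):
    """3N x 3N ANM Hessian built from 3x3 super-elements."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            r = coords[j] - coords[i]
            d2 = r @ r
            if d2 < r_c ** 2:
                # Off-diagonal block: -gamma * (r r^T) / |r|^2
                block = -gamma * np.outer(r, r) / d2
                hessian[3*i:3*i+3, 3*j:3*j+3] = block
                hessian[3*j:3*j+3, 3*i:3*i+3] = block
                # Diagonal blocks are minus the sum of the off-diagonal ones.
                hessian[3*i:3*i+3, 3*i:3*i+3] -= block
                hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian

def anm_covariance(hessian, tol=1e-8):
    """Pseudo-inverse of the Hessian over the non-zero modes."""
    lam, u = np.linalg.eigh(hessian)
    keep = lam > tol            # drop rigid-body translations and rotations
    return (u[:, keep] / lam[keep]) @ u[:, keep].T, lam

# Toy network: 8 random nodes, cutoff large enough to connect all pairs.
rng = np.random.default_rng(0)
coords = 5.0 * rng.normal(size=(8, 3))
hessian = anm_hessian(coords, r_c=1000.0)
cov, lam = anm_covariance(hessian)

n_zero_modes = int(np.sum(lam < 1e-8))   # rigid-body modes
adp_node0 = cov[0:3, 0:3]                # predicted 3x3 ADP block of node 0
```

The eigenvectors with the smallest non-zero eigenvalues are the slow modes; the 3×3 diagonal blocks of the covariance are the predicted per-node ADP matrices, up to a scale factor.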
…so what useful information do we get?
• Decomposition of the motion to independent modes. In each mode all particles move in the same phase and frequency
• Fluctuations of specific residues
• Correlation between residues
• Each mode has a unique frequency. The slowest modes facilitate more extensive and cooperative conformation changes. In macromolecules these changes are often related to the function
Konrad Hinsen
“Normal Mode Theory and Harmonic Approximations”
The major advantages of elastic network models over MD simulations:
1. Insight into longer time scales
2. Accurate solution
3. Fast computational time: large complexes, database-scale analysis
The major disadvantages:
1. Approximate potential: harmonic approximation
2. Theoretically, should be valid only around local minima
The motions along the slowest normal mode correlate well with experimentally observed large scale functional motions
Intrinsic ability of proteins to undergo functional motions
r=0.87
Ovotransferrin
Correlation between normal modes and large scale conformation change
HIV-1 reverse transcriptase
actin
maltodextrin-binding protein
glutamate-binding protein
LAO
LIR-1
Calmodulin (CaM)
P38 kinase
Yang and Bahar (2005) Structure 13, 893-904
phospholipase ricin
Relation between motion and catalytic function
Slow modes
Theory
Biological motivation
Evaluation of the models
Dynamics of molecules in crystals
Anisotropy of mechanical unfolding
Dynamics and Evolution
Phosphorylation sites
Available tools
Ovocleidin (egg shell protein) 1gz2, P83515
[Figure: fluctuations by residue; marked residues: SS SY SSSTS SSS SY TT T S YT]
Red: phosphorylation site
Green: not known to be phosphorylated
Fluctuations in the slow mode of phosphorylation sites vs non-phosphorylation sites
Anisotropic Network Model (ANM)
Covariance (3N×3N):
$$C = H^{-1} = \sum_{i=1}^{3N-6}\lambda_i^{-1}\,u_i u_i^{T}$$
Covariance of each node: $[H^{-1}]_{ii}$, the ADP matrix for each residue
Doruker et al. (2000) Proteins 40, 512-524
Atilgan et al. (2001) Biophys J. 80, 505-515
[Figure panels: Experimental, Theoretical]
Blue color indicates small fluctuations
Antifungal protein EAFP2 from Oliver tree
Eyal et al. (2007) Bioinformatics 23, i175-184.
How to compare sets of ADPs?
• Correlation coefficient between the ADP values
• Anisotropy (A): variance at the short axis / variance at the long axis, 0 < A ≤ 1
• Volume (V): the ellipsoid volumes
• Kullback-Leibler distance (KLD): an overlap index that considers the shape and orientation of ellipsoids a and b (defined by their 3×3 ADP matrices). It can be expressed using the eigenvalues (d) and eigenvectors (v) of the individual ADP matrices:
$$D_{ab} = -\frac{3}{2} + \frac{1}{2}\sum_{k=1}^{3}\ln\frac{d_{bk}}{d_{ak}} + \frac{1}{2}\sum_{k=1}^{3}\sum_{l=1}^{3}\frac{d_{ak}}{d_{bl}}\left(v_{ak}^{T}\,v_{bl}\right)^{2}$$
0 ≤ KLD ≤ ∞
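The comparison measures for a pair of ADP matrices can be sketched as follows. The KLD is written in terms of eigenvalues and eigenvectors, as the Kullback-Leibler divergence between two zero-mean Gaussians; the function names and the two example matrices are hypothetical, and both matrices are assumed to be symmetric positive definite.

```python
import numpy as np

def kld(adp_a, adp_b):
    """Kullback-Leibler distance between two 3x3 ADP matrices (zero-mean Gaussians)."""
    d_a, v_a = np.linalg.eigh(adp_a)
    d_b, v_b = np.linalg.eigh(adp_b)
    overlap = (v_a.T @ v_b) ** 2               # overlap[k, l] = (v_ak . v_bl)^2
    ratio = d_a[:, None] / d_b[None, :]        # ratio[k, l] = d_ak / d_bl
    return -1.5 + 0.5 * np.sum(np.log(d_b / d_a)) + 0.5 * np.sum(ratio * overlap)

def anisotropy(adp):
    """Variance at the short axis divided by variance at the long axis."""
    d = np.linalg.eigvalsh(adp)
    return d[0] / d[-1]

# Illustrative ADPs: an elongated ellipsoid and an isotropic sphere.
adp_a = np.diag([0.1, 0.2, 0.3])
adp_b = 0.2 * np.eye(3)

kld_same = kld(adp_a, adp_a)     # identical ellipsoids
kld_ab = kld(adp_a, adp_b)
aniso_a = anisotropy(adp_a)
```

The measure is not symmetric in general: `kld(adp_a, adp_b)` and `kld(adp_b, adp_a)` can differ, so it matters which ellipsoid is taken as the reference.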
Test set for examination of experimental and theoretical ADPs
A set of 93 high resolution (R<1.5 Å) non-redundant proteins
Columns "volumes" to "off-diagonal": Pearson correlation coefficient r between the two sets
PDB structures      N    volumes  Anisotropy  (ΔR)²   All ADP  diagonal  off-diagonal  KLD
Mean (93 proteins)  186  0.547    0.260       0.571   0.771    0.486     0.343         0.429
Correspondence between theory and experiments
Correlation coefficient of directional fluctuations and overall fluctuations at EAFP2
How similar are the experimental data reported for the same protein in different PDB files?
A closer look at experimental data
The contribution of the rigid body motion to the experimental fluctuations is smaller in denser crystals
Columns "volumes" to "off-diagonal": Pearson correlation coefficient r between the two sets
PDB structures      N    volumes  Anisotropy  (ΔR)²   All ADP  diagonal  off-diagonal  KLD
1kt5A 1kt7A         175  0.855    0.084       0.872   0.872    0.752     0.606         0.230
1g66A 1bs9A         207  0.896    0.351       0.897   0.918    0.799     0.687         0.157
3lztA 4lztA         129  0.899    0.345       0.845   0.870    0.731     0.525         0.166
1g6xA 1k6uA         58   0.774    0.860       0.828   0.930    0.797     0.802         0.043
1m1rA 1m1qA         90   0.881    0.798       0.900   0.962    0.872     0.754         0.066
1rgzA 1q2qA         359  0.926    0.347       0.930   0.912    0.827     0.672         0.230
1me4A 1me3A         208  0.939    0.478       0.949   0.957    0.888     0.711         0.121
1pwmA 1z8aA         315  0.921    0.843       0.933   0.928    0.903     0.837         0.120
1a6mA 1bzpA         151  0.664    0.532       0.638   0.858    0.613     0.482         0.134
1i1wA 1i1xA         299  0.818    0.443       0.812   0.941    0.784     0.615         0.114
1sx7A 1swzA         164  0.990    0.896       0.963   0.962    0.948     0.779         0.070
1oc7A 1oc6A         364  0.933    0.007       0.939   0.970    0.914     0.664         0.058
1m40A 1nymA         263  0.806    0.367       0.806   0.933    0.638     0.534         0.129
1lokA 1rtqA         291  0.832    0.535       0.795   0.846    0.719     0.467         0.200
1lugA 1oq5A         130  0.847    0.613       0.840   0.902    0.778     0.692         0.222
1f9yA 1q0nA         158  0.864    0.400       0.865   0.895    0.771     0.622         0.308
1kmvA 1kmsA         183  0.900    0.710       0.924   0.937    0.856     0.876         0.135
1pq7A 1gdnA         51   0.801    0.176       0.830   0.902    0.638     0.474         0.183
1nwcA 1uwnX         120  0.930    0.062       0.951   0.925    0.857     0.717         0.231
Mean (19 pairs)     195  0.867    0.464       0.870   0.916    0.793     0.658         0.154
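Columns "volumes" to "off-diagonal": Pearson correlation coefficient r between the two sets
PDB structures      N    volumes  Anisotropy  (ΔR)²   All ADP  diagonal  off-diagonal  KLD
1iqzA 1ir0A         81   0.295    0.308       0.356   0.716    0.339     0.426         0.256
3lztA 1ieeA         129  0.587    0.128       0.560   0.800    0.506     0.281         0.317
1pwmA 1t41A         315  0.747    0.042       0.756   0.836    0.656     0.540         0.392
1a6mA 1u7sA         151  0.448    0.326       0.436   0.799    0.381     0.386         0.540
2a6zA 2a70B         222  0.854    0.506       0.853   0.920    0.797     0.675         0.163
1f9yA 1rb0A         151  0.219    0.109       0.372   0.676    0.253     0.183         0.968
1pq7A 1ppzA         223  0.522    0.206       0.557   0.817    0.433     0.366         0.376
1nwzA 1kouA         119  0.682    0.294       0.644   0.812    0.480     0.482         0.287
Mean (8 pairs)      173  0.544    0.240       0.569   0.797    0.480     0.417         0.412
Mean (93 proteins)  186  0.547    0.260       0.571   0.771    0.486     0.343         0.429
[Row-group labels in the slide: same crystal form; diff. crystal; anm]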
The levels of agreement between the different crystal forms of the same protein are comparable to those between theoretical predictions and experimental data
Hen egg lysozyme
[Figure panels: 4lzt (SG: P1), 3lzt (SG: P1), 1iee (SG: P43 21), ANM (3lzt); axis: KLD, with an arrow marked "Better agreement"]
Refining the model parameter
Contact criterion: d < rc
[Figure: histograms of anisotropy values; panels: experimental, rc = 10 Å, rc = 12 Å, rc = 15 Å, rc = 18 Å]
Anisotropy in mechanical resistance to unfolding
• Processes in living cells depend on the mechanical properties of bio-molecules. Many proteins need resistance to mechanical pressures to fulfill their function, for example within muscle fibers and in the cytoskeleton.
• Recent advances in single-molecule techniques such as atomic force microscopy (AFM) and optical tweezers make it possible to examine the response of proteins to well-oriented tensions.
• Unfolding forces in different pulling directions have been measured for green fluorescent protein (GFP), ubiquitin and the lipoyl domain (E2lip3) of acetyl transferase subunit E2p.
• Significant differences have been observed in the responses of the very same molecule to pulling along different directions.
Dietz et al. (2006) PNAS 103, 12724–12728
Eyal and Bahar (2008) Biophys J, in press
$$\cos\left(a_{ij}^{(k)}\right) \equiv \frac{R_{ij}^{(0)} \cdot \Delta R_{ij}^{(k)}}{\left|R_{ij}^{(0)}\right|\,\left|\Delta R_{ij}^{(k)}\right|}$$
Normalized contribution of each mode:
Weighted contribution of each mode:
Effective spring constant for the system
Mechanical resistance of GFP
The contributions of different modes distribute differently in the different directions.
Unfolding along the barrel axis requires weaker forces
Complete mechanical resistance map of GFP
        Calc (N/m)   Exp (pN)
1-41    1.25         177
1-80    0.4          25
Mechanical resistance of E2lip3
Elastic Network Models for predicting mechanical resistance
• ENM emerges as a promising tool to estimate the anisotropic mechanical resistance of globular proteins
• The method is very efficient and can be used to scan many pulling directions and to guide the experimental setup.
How do ENMs predict path-dependent events?
• The theoretical scope of ANM is only “close enough” to local minima
• How can ANM predict large scale conformation changes and unfolding paths?
• The topology of the folded state contains much of the information required to determine alternative states and transition paths.
http://ignmtest.ccbb.pitt.edu/anm