
Computational method on biochemistry

정진원

Outline

• Protein Structure and Dynamics
• Bioinformatics
• Comparative modeling
• Other methods

Protein structure and dynamics

• Time scale in biological phenomena
• Newtonian mechanics
• Force field
  • CHARMM
  • AMBER
• Energy minimization
• Molecular Dynamics
• Example

Time scale in biological phenomena

[Figure: time-scale axis for biological phenomena, running from fs (10^-15 s) through ps, ns, μs, ms and s up to ~hr]

Force field

• Defines an energy from the coordinates (positions) of each atom in a given molecule.

• This value is a numerical quantity constructed to model the state of the molecule, so it has no direct relation to the energy of the real physical system.
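
As a rough illustration of how an energy can be defined from atomic coordinates alone, the sketch below sums a harmonic bond term and a Lennard-Jones term. The function names, the choice of these two terms, and every parameter value are assumptions made for illustration; they are not CHARMM or AMBER parameters.

```python
import math

def bond_energy(r, k=300.0, r0=1.0):
    """Harmonic bond-stretch term: k * (r - r0)^2 (illustrative k and r0)."""
    return k * (r - r0) ** 2

def lennard_jones(r, epsilon=0.1, sigma=3.4):
    """Lennard-Jones 12-6 term for a non-bonded pair (illustrative parameters)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def total_energy(coords, bonds, nonbonded):
    """Sum bonded and non-bonded terms using atomic coordinates only."""
    energy = 0.0
    for i, j in bonds:
        energy += bond_energy(math.dist(coords[i], coords[j]))
    for i, j in nonbonded:
        energy += lennard_jones(math.dist(coords[i], coords[j]))
    return energy

# Three hypothetical atoms: 0-1 bonded, 0-2 treated as a non-bonded pair.
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.1, 0.0, 4.0)]
energy = total_energy(coords, bonds=[(0, 1)], nonbonded=[(0, 2)])
print(f"E = {energy:.3f} (arbitrary units)")
```

The number printed is only meaningful relative to other conformations scored with the same function, which is the point made in the second bullet above.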

Newtonian mechanics

• $F = ma$
• $v = v_0 + at = f(t)$
• $s = v_0 t + \frac{1}{2} a t^2 = g(t)$
• $E = \frac{1}{2} m v^2$

When a force acts and time passes, an object's position, velocity, and energy change.

Energy minimization

Optimize the structure!
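
One simple way to optimize a structure is steepest descent: repeatedly move the coordinates a small step against the energy gradient. The sketch below does this for a single bond length with a harmonic energy; the function name, step size, and the constants `K` and `R0` are hypothetical, and real minimizers work on all atomic coordinates at once.

```python
def steepest_descent(gradient, x0, step=1e-3, max_iter=1000, tol=1e-8):
    """Follow the negative gradient until it is (almost) zero."""
    x = x0
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break          # gradient vanished: we are at a minimum
        x -= step * g      # move downhill
    return x

K, R0 = 300.0, 1.0         # hypothetical force constant and ideal bond length

def bond_gradient(r):
    """dE/dr for the harmonic energy E(r) = K * (r - R0)^2."""
    return 2.0 * K * (r - R0)

r_min = steepest_descent(bond_gradient, x0=1.5)
print(f"minimized bond length: {r_min:.6f}")
```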

Molecular Dynamics

• $E_{tot} = E_{pot} + E_{kin}$
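
To make the energy bookkeeping concrete, here is a minimal sketch that integrates Newton's equation for a one-dimensional harmonic oscillator and checks that E_tot = E_pot + E_kin stays nearly constant. The velocity Verlet integrator is a common choice but is an assumption here (the slides do not name one), and the constants `K`, `MASS`, and the time step are hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate F = ma with the velocity Verlet scheme."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append((x, v))
    return trajectory

K, MASS = 1.0, 1.0  # hypothetical spring constant and mass

def total_energy(x, v):
    e_pot = 0.5 * K * x * x
    e_kin = 0.5 * MASS * v * v
    return e_pot + e_kin

trajectory = velocity_verlet(x=1.0, v=0.0, force=lambda x: -K * x,
                             mass=MASS, dt=0.01, n_steps=1000)
energies = [total_energy(x, v) for x, v in trajectory]
print(f"E_tot drift: {max(energies) - min(energies):.2e}")
```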

CHemistry at HARvard Macromolecular Mechanics

• CHARMm forcefields
  • CHARMm, which derives from CHARMM (CHemistry at HARvard Macromolecular Mechanics), is a highly flexible molecular mechanics and dynamics program originally developed in the laboratory of Dr. Martin Karplus at Harvard University. It was parameterized on the basis of ab initio energies and geometries of small organic models.
• Applicability
  • CHARMm performs well over a broad range of calculations and simulations, including calculation of geometries, interaction and conformation energies, local minima, barriers to rotation, time-dependent dynamic behavior, free energy, and vibrational frequencies (Momany & Rone, 1992). CHARMm is designed to give good (but not necessarily "the best") results for a wide variety of modelled systems, from isolated small molecules to solvated complexes of large biological macromolecules; however, it is not applicable to organometallic complexes.

Assisted Model Building with Energy Refinement

• AMBER forcefield
  • The standard AMBER forcefield (Weiner et al. 1984, 1986) is parameterized to small organic constituents of proteins and nucleic acids. Only experimental data were used in parameterization.
  • However, AMBER has been widely used not only for proteins and DNA, but also for many other classes of models, such as polymers and small molecules. For the latter classes of models, various authors have added parameters and extended AMBER in other ways to suit their calculations. The AMBER forcefield has also been made specifically applicable to polysaccharides (Homans 1990, and see Homans' carbohydrate forcefield).
  • AMBER is used mainly for modeling proteins and nucleic acids. It is generally lower in accuracy and has a limited range of applicability. The use of AMBER is recommended mainly for those customers who are familiar with AMBER and have developed their own AMBER-specific parameters. It generally gives reasonable results for gas-phase model geometries, conformational energies, vibrational frequencies, and solvation free energies.

Application

• protein motion
• protein folding
• enzyme mechanism
• model optimization

In silico protein folding

1 μs = 1,000,000,000 fs (or steps)

644 steps/sec on a 256-CPU Cray machine
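
The two figures above imply a wall-clock cost that is easy to work out. The short calculation below assumes one MD step per femtosecond, as the slide's "fs (or steps)" suggests; the variable names are only for illustration.

```python
STEPS_PER_MICROSECOND = 1_000_000_000   # 1 us of simulated time at 1 fs per step
STEPS_PER_WALL_SECOND = 644             # throughput quoted for the 256-CPU machine

wall_seconds = STEPS_PER_MICROSECOND / STEPS_PER_WALL_SECOND
wall_days = wall_seconds / 86_400       # seconds in one day

print(f"{wall_seconds:,.0f} s of wall-clock time, about {wall_days:.1f} days")
```

At that rate a single microsecond of folding takes roughly 18 days of continuous computing.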

Simulation of the travel of potassium


Bioinformatics

• Introduction
• Sequence alignment
  • Pairwise sequence alignment
    • BLAST
  • Multiple sequence alignment
    • CLUSTALW
    • T-COFFEE
• Scoring matrix
• Structure Alignment
• Example

Pairwise alignment

• Smith-Waterman Algorithm: exact local alignment
• BLAST: heuristic local alignment
• FASTA: heuristic local alignment

Smith-Waterman Algorithm

Align S1 = ATCTCGTATGATG and S2 = GTCTATCAC

Substitution score:

$$S_{bt}(x, y) = \begin{cases} 2 & \text{if } x = y \\ -1 & \text{else} \end{cases}$$

Gap penalty: 1

Recurrence:

$$H(i, j) = \max \begin{cases} 0 \\ H(i-1, j-1) + S_{bt}(S1_i, S2_j) \\ H(i-1, j) - 1 \\ H(i, j-1) - 1 \end{cases}$$

[Figure: scoring matrix H with S1 = ATCTCGTATGATG along the columns and S2 = GTCTATCAC along the rows, with the best local alignment highlighted]
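
The recurrence can be turned into a few lines of code. This is a minimal sketch that only fills the matrix H (no traceback), with rows following S2 and columns following S1 as in the slide's matrix; the function name and argument names are illustrative.

```python
def smith_waterman_matrix(s1, s2, match=2, mismatch=-1, gap=1):
    """Fill the local-alignment matrix H; rows follow s2, columns follow s1."""
    rows, cols = len(s2) + 1, len(s1) + 1
    H = [[0] * cols for _ in range(rows)]   # first row and column stay 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if s2[i - 1] == s1[j - 1] else mismatch
            H[i][j] = max(
                0,                       # start a new local alignment
                H[i - 1][j - 1] + s,     # match or mismatch
                H[i - 1][j] - gap,       # gap in s1
                H[i][j - 1] - gap,       # gap in s2
            )
    return H

H = smith_waterman_matrix("ATCTCGTATGATG", "GTCTATCAC")
for residue, row in zip("GTCTATCAC", H[1:]):
    print(residue, " ".join(f"{value:2d}" for value in row[1:]))
```

The highest entry in H marks where the best local alignment ends; following the choices that produced it back to a zero recovers the alignment itself.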

BLAST

• Basic Local Alignment Search Tool
• Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J., Journal of Molecular Biology, v. 215, 1990, pp. 403-410
• Used to search sequence databases for local alignments to a query

BLAST algorithm

• Keyword search of all words of length w from the query of length n in a database of length m, keeping words that score above a threshold
  • w = 11 for nucleotide queries, 3 for proteins
• Do local alignment extension for each found keyword
  • Extend result until longest match above threshold is achieved
• Running time O(nm)

BLAST algorithm (cont’d)

Query: KRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKIFLENVIRD

Keyword: GVK

Neighborhood words (neighborhood score threshold T = 13):

GVK 18
GAK 16
GIK 16
GGK 14
GLK 13
GNK 12
GRK 11
GEK 11
GDK 11

Extension to a High-scoring Pair (HSP):

Query:  22 VLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLK 60
           +++DN +G +   IR L    G+K I+ L+ E+ RG++K
Sbjct: 226 IIKDNGRGFSGKQIRNLNYGIGLKVIADLV-EKHRGIIK 263

Original BLAST

• Dictionary
  • All words of length w
• Alignment
  • Ungapped extensions until score falls below some threshold
• Output
  • All local alignments with score > statistical threshold

Original BLAST: Example

[Figure: dot matrix of ACGAAGTAAGGTCCAGT (horizontal axis) against CTGATCCTGGATTGCGA (vertical axis, top to bottom), with the extended diagonal highlighted]

• w = 4
• Exact keyword match of GGTC
• Extend diagonals with mismatches until score is under 50%
• Output result:
  GTAAGGTCC
  GTTAGGTCC

From lectures by Serafim Batzoglou (Stanford)
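
The seed-and-extend idea in this example can be sketched in code. The version below indexes exact words of length w, then extends each hit along its diagonal without gaps. It is not the slide's exact rule: instead of "until score is under 50%" it uses a simple drop-off cutoff (stop once the running score falls `x_drop` below the best seen), with +1 for a match and -1 for a mismatch. The function names and the `x_drop` value are assumptions, and the second sequence is the figure's vertical axis read bottom to top.

```python
def extend(a, b, i, j, step, x_drop):
    """Walk along a diagonal from (i, j); return how many positions to keep."""
    score = best = 0
    length = best_length = 0
    while 0 <= i < len(a) and 0 <= j < len(b):
        score += 1 if a[i] == b[j] else -1
        length += 1
        if score > best:
            best, best_length = score, length
        elif best - score >= x_drop:
            break  # score fell too far below the best seen so far
        i += step
        j += step
    return best_length

def find_hsps(query, subject, w=4, x_drop=2):
    """Seed with exact w-mers, then extend each seed without gaps."""
    # Dictionary: every word of length w in the subject, with its positions.
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)

    hsps = set()
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            left = extend(query, subject, i - 1, j - 1, -1, x_drop)
            right = extend(query, subject, i + w, j + w, +1, x_drop)
            hsps.add((query[i - left:i + w + right],
                      subject[j - left:j + w + right]))
    return sorted(hsps)

print(find_hsps("ACGAAGTAAGGTCCAGT", "AGCGTTAGGTCCTAGTC"))
```

With these inputs every seed on the diagonal (including GGTC) extends to the same pair, GTAAGGTCC against GTTAGGTCC, which is the result shown on the slide.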

ClustalW

• Popular multiple alignment tool today
• Several heuristics to improve accuracy:
  • Sequences are weighted by relatedness
  • Scoring matrix can be chosen “on the fly”
  • Position-specific gap penalties

ClustalW (cont’d)

• Often used for protein alignment
• ‘W’ stands for ‘weighted’
  • Different parts of alignment are weighted.
  • Position/residue specific gap penalties.
• Three-step process
  1.) Pairwise alignment
  2.) Build Guide Tree
  3.) Progressive Alignment

Step 1: Pairwise Alignment

• Aligns each sequence against every other, giving a distance matrix

• Distance = exact matches / sequence length (percent identity)

      S1    S2    S3    S4
S1    -
S2   .17    -
S3   .87   .28    -
S4   .59   .33   .62    -

(.17 means 17% identical)
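
The identity measure above is simple to compute once two sequences have been aligned. The sketch below assumes the pair is already aligned to equal length with "-" for gaps and divides by the alignment length; the function name and the toy sequences are hypothetical, and ClustalW's own pairwise step is more involved.

```python
def percent_identity(a, b):
    """Fraction of aligned columns where both sequences have the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return matches / len(a)

# Hypothetical aligned sequences, used only to show the bookkeeping.
aligned = {"S1": "ACGT", "S2": "ACGA", "S3": "A-GT"}
names = sorted(aligned)
for i, first in enumerate(names):
    for second in names[:i]:
        value = percent_identity(aligned[first], aligned[second])
        print(f"{first} vs {second}: {value:.2f}")
```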

Step 2: Guide Tree

• Create Guide Tree using the distance matrix

• ClustalW uses the neighbor-joining method

• Guide tree roughly reflects evolutionary relations


Step 2: Guide Tree (cont’d)

      S1    S2    S3    S4
S1    -
S2   .17    -
S3   .87   .28    -
S4   .59   .43   .62    -

[Figure: guide tree with leaves S1, S3, S4 and S2]

Calculate:
s1,3 = consensus(s1, s3)
s1,3,4 = consensus((s1,3), s4)
s1,2,3,4 = consensus((s1,3,4), s2)
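
The joining order in the calculation above can be reproduced with a small clustering sketch. Note the substitution: ClustalW uses neighbor-joining, whereas this example uses plain average-linkage clustering (repeatedly merge the two most similar clusters) because it is shorter to write down. The function name and data layout are assumptions; the similarity values are the ones in the table above.

```python
from itertools import combinations

# Pairwise identities from the table above.
SIM = {
    frozenset(("S1", "S2")): 0.17,
    frozenset(("S1", "S3")): 0.87,
    frozenset(("S2", "S3")): 0.28,
    frozenset(("S1", "S4")): 0.59,
    frozenset(("S2", "S4")): 0.43,
    frozenset(("S3", "S4")): 0.62,
}

def guide_tree_order(names, sim):
    """Return the clusters formed at each merge, most similar pair first."""
    clusters = [(name,) for name in names]

    def average_similarity(a, b):
        pairs = [(x, y) for x in a for y in b]
        return sum(sim[frozenset(pair)] for pair in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2),
                   key=lambda ab: average_similarity(*ab))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append(a + b)
    return merges

merges = guide_tree_order(["S1", "S2", "S3", "S4"], SIM)
for step, cluster in enumerate(merges, start=1):
    print(step, sorted(cluster))
```

On this matrix the simpler method gives the same order as the slide: S1 with S3 first, then S4, then S2.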

Step 3: Progressive Alignment

• Align the two most similar sequences
• Following the guide tree, add in the next sequences, aligning to the existing alignment
• Insert gaps as necessary

Sample output:

FOS_RAT     PEEMSVTS-LDLTGGLPEATTPESEEAFTLPLLNDPEPK-PSLEPVKNISNMELKAEPFD
FOS_MOUSE   PEEMSVAS-LDLTGGLPEASTPESEEAFTLPLLNDPEPK-PSLEPVKSISNVELKAEPFD
FOS_CHICK   SEELAAATALDLG----APSPAAAEEAFALPLMTEAPPAVPPKEPSG--SGLELKAEPFD
FOSB_MOUSE  PGPGPLAEVRDLPG-----STSAKEDGFGWLLPPPPPPP-----------------LPFQ
FOSB_HUMAN  PGPGPLAEVRDLPG-----SAPAKEDGFSWLLPPPPPPP-----------------LPFQ

A line of dots and stars printed beneath the alignment shows how well-conserved each column is.

Scoring Matrix

• BLOSUM
• PAM
• PSSM

PAM

• Percentage of Acceptable point Mutations per 10^8 years
• Scores are set from the probability that a given amino acid is replaced by another amino acid
• Matrices are based on global alignments of closely related proteins. The PAM 1 is the matrix calculated from comparisons of sequences with no more than 1% divergence. Scores are derived from a mutation probability matrix where each element gives the probability of the amino acid in column X mutating to the amino acid in row Y after a particular evolutionary time, for example after 1 PAM, or 1% divergence. A PAM matrix is specific for a particular evolutionary distance, but may be used to generate matrices for greater evolutionary distances by multiplying it repeatedly by itself. However, at large evolutionary distances the information present in the matrix is essentially degenerated. It is rare that a PAM matrix would be used for an evolutionary distance any greater than 256 PAMs.
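
The "multiplying it repeatedly by itself" step can be illustrated with a toy matrix. The 3x3 mutation probability matrix below is invented for three hypothetical residue types (the real PAM 1 is 20x20 and its values are not reproduced here); the helper names are also assumptions.

```python
def mat_mul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def pam_n(pam1, n):
    """Mutation probabilities after n PAM units: pam1 multiplied by itself n times."""
    result = pam1
    for _ in range(n - 1):
        result = mat_mul(result, pam1)
    return result

# Hypothetical 1-PAM matrix: each column sums to 1, about 1% change per residue.
PAM1_TOY = [
    [0.990, 0.006, 0.004],
    [0.006, 0.990, 0.004],
    [0.004, 0.004, 0.992],
]

pam250 = pam_n(PAM1_TOY, 250)
for row in pam250:
    print(" ".join(f"{p:.3f}" for p in row))
```

As n grows the columns drift toward the same values, which is the loss of information at large evolutionary distances mentioned above.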

BLOSUM

• Developed for use in local alignment
• BLOcks SUbstitution Matrix
• Sequences with a given degree of similarity are collected and aligned, and the scoring matrix is built from the substitution frequencies observed within those alignments

• BLOSUM62 is built from blocks in which sequences sharing 62% or more identity are clustered and counted as a single sequence

Position Specific Scoring Matrix

• Scores how likely a particular amino acid is to appear at a particular position, based on the sequence alignment of similar proteins
• The method used by PSI-BLAST
• Well suited to database-wide searches for proteins that carry characteristic sequences or residues
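
A position-specific matrix can be sketched as per-column log-odds scores. The example below is a simplified illustration, not PSI-BLAST's actual procedure: it assumes a uniform background frequency, a flat pseudocount, and a small hypothetical alignment, and the function names are invented for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Log-odds score for every residue at every alignment column."""
    background = 1.0 / len(alphabet)          # assumption: uniform background
    total = len(aligned) + pseudocount * len(alphabet)
    pssm = []
    for column in zip(*aligned):
        counts = Counter(column)
        pssm.append({
            residue: math.log2((counts[residue] + pseudocount) / total / background)
            for residue in alphabet
        })
    return pssm

def score(pssm, segment):
    """Score a segment of the same length as the matrix."""
    return sum(column[residue] for column, residue in zip(pssm, segment))

# Hypothetical aligned family members.
pssm = build_pssm(["GVK", "GAK", "GIK", "GLK"])
print(f"GVK: {score(pssm, 'GVK'):.2f}, WWW: {score(pssm, 'WWW'):.2f}")
```

Residues seen often in a column score above zero there and unseen residues score below zero, so a segment matching the family's conserved positions scores higher than an unrelated one.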

Homology/Comparative modeling

• Introduction
• Method
• Example

Introduction

• Proteins with similar functions have similar structures.
• Ex) hemoglobin/myoglobin, ubiquitin/ubiquitin-like proteins, serine proteases, thioredoxin/glutaredoxin

Method

1. Search for proteins with 30% or greater homology whose structures are known
2. Pairwise or multiple sequence alignment
3. Using the alignment as a guide, copy the structure or derive distance constraints
4. Optimize the model

Example: Modeling of malonyl-CoA synthetase

[Figure: malonyl-CoA synthetase alongside firefly luciferase]

Other Methods

• Simulated Annealing
• Monte Carlo method
• Docking