Aims
• Understand the use of algorithms
• Recognize different approaches
• Understand the limitations
Objectives
• Predict occurrence of aspects of structure
• Select appropriate tools
1° (primary) structure
• Amino acid sequence
NH2-MRLSWYDPDFQARLTRSNSKCQGQLEV YLKDGWHMVC SQSWGRSSKQWEDPSQASKVCQRLNCGVPLSLGPFLVTYTPQSSIICYGQLGSFSNCSHSRNDMCHSLGLTCLE-COOH
The problem…..
• The best way to determine a structure is experimentally, by X-ray crystallography, NMR etc.
• Structure databases hold only about 10,000+ structures
• Therefore devise programs to deduce structural solutions
• Complex!
Secondary Structure prediction
• Signal peptides
• Intracellular targeting
• Transmembrane α-helices
• α-helices and β-sheets
• Super-secondary structure (motifs)
Signal peptides
• Short N-terminal amino acid sequences
• Direct to membrane
• Cleaved after translocation
• SignalP predicts signal peptides and their cleavage sites
• Signal hypothesis: Nobel Prize 1999, Günter Blobel
Is the sequence a signal peptide?
# Measure   Position   Value   Cutoff   Conclusion
  max. C    25         0.910   0.37     YES
  max. Y    25         0.861   0.34     YES
  max. S    12         0.960   0.88     YES
  mean S    1-24       0.892   0.48     YES
# Most likely cleavage site between pos. 24 and 25: SRA-LE
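The YES/NO column can be reproduced by comparing each value with its cutoff. A minimal sketch, assuming that "YES" simply means the value exceeds the cutoff and that all four measures must agree (both are assumptions for illustration, not a statement of how SignalP itself decides); the function name is hypothetical.

```python
def signal_peptide_calls(rows):
    """Map each measure to True when its value exceeds its cutoff."""
    return {measure: value > cutoff for measure, value, cutoff in rows}


# (measure, value, cutoff) copied from the SignalP output shown above.
rows = [
    ("max. C", 0.910, 0.37),
    ("max. Y", 0.861, 0.34),
    ("max. S", 0.960, 0.88),
    ("mean S", 0.892, 0.48),
]

calls = signal_peptide_calls(rows)
# Assumption: call it a signal peptide only if every measure passes.
is_signal_peptide = all(calls.values())
print(calls, is_signal_peptide)
```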
Intracellular targeting
• TargetP
• Predict subcellular location of eukaryotic protein
• Presequences
– Chloroplasts
– Mitochondria
– Signal peptide
Transmembrane Domains
• Lots of programs
• TMHMM
– α-helices: hydrophobic
– helix topology
– R or K +ve charge on the cytoplasmic side
– Hidden Markov Modelling
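The "hydrophobic" criterion can be illustrated with a sliding-window hydropathy profile. This is a much simpler technique than the Hidden Markov Model used by TMHMM and is shown only as a sketch. It uses the Kyte-Doolittle hydropathy scale; the window of 19 residues and the threshold of 1.6 are commonly quoted values for that scale and are treated here as assumptions, as are the function names.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, mean in enumerate(profile) if mean >= threshold]
```

A window that passes is only a candidate: this sketch says nothing about topology or the charged residues flanking the helix.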
α-helices and β-sheets
• GOR algorithm
• Assigns each residue to one conformational state: α-helix, extended chain, reverse turn or coil
• 64.4% accurate
• Many other sites
• Most use multiple alignments
α-helices and β-sheets
        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MKFSWRTALLWSLPLLVVGFFFWQGSFGGADANLGSNTANTRMTYGRFLEYVDAGRITSVDLYENGRTAI
cccceeeeeecccceeeeeeeeccccccccccccccccccchhhhcceeeeccccceeeeeeccccceee
VQVSDPEVDRTLRSRVDLPTNAPELIARLRDSNIRLDSHPVRNNGMVWGFVGNLIFPVLLIASLFFLFRR
eeccccccchhhhccccccccchhhhhhhhhccccccccceecccceeeeecccccchhhhhhhhheeec
SSNMPGGPGQAMNFGKSKARFQMDAKTGVMFDDVAGIDEAKEELQEVVTFLKQPERFTAVGAKIPKGVLL
cccccccccchhhhcchhhhhhhhccceeeecchhhhhhhhhhhhhhhhhhcccchhhhhcccccceeee
VGPPGTGKTLLAKAIAGEAGVPFFSISGSEFVEMFVGVGASRVRDLFKKAKENAPCLIFIDEIDAVGRQR
ecccccchhhhhhhhhcccccceeecccccceeeeeecccchhhhhhhhhcccccceeeecchhhhcccc
GAGIGGGNDEREQTLNQLLTEMDGFEGNTGIIIIAATNRPDVLDSALMRPGRFDRQVMVDAPDYSGRKEI
ccccccccchhhhhhhhhhhhhcccccccceeeeeeccccchhhhhhccccccceeeeecccccccchhh
LEVHARNKKLAPEVSIDSIARRTPGFSGADLANLLNEAAILTARRRKSAITLLEIDDAVDRVVAGMEGTP
hhhhhhhhccccccchhhhccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhheeecccccc
LVDSKSKRLIAYHEVGHAIVGTLLKDHDPVQKVTLIPRGQAQGLTWFTPNEEQGLTTKAQLMARIAGAMG
cccccccchhhhhcccceeeeeecccccccceeeecccccccceeccccccccchhhhhhhhhhhhhhhh
GRAAEEEVFGDDEVTTGAGGDLQQVTEMARQMVTRFGMSNLGPISLESSGGEVFLGGGLMNRSEYSEEVA
hhhhhhhcccccceeeccccchhhhhhhhhhhhhhhccccccccccccccceeeecccccccccchhhhh
TRIDAQVRQLAEQGHQMARKIVQEQREVVDRLVDLLIEKETIDGEEFRQIVAEYAEVPVKEQLIPQL
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhcccccccccccc
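A per-residue prediction string like the one above (h = helix, e = extended strand, c = coil) is easier to read once it is collapsed into segments. A minimal sketch; the function names and the short example string are hypothetical and are not taken from the prediction above.

```python
from itertools import groupby


def segments(pred):
    """Collapse a per-residue state string into (state, start, end) runs.

    Positions are 1-based and inclusive, matching the ruler in the output.
    """
    runs = []
    pos = 1
    for state, run in groupby(pred):
        length = len(list(run))
        runs.append((state, pos, pos + length - 1))
        pos += length
    return runs


def composition(pred):
    """Fraction of residues assigned to each of the states h, e and c."""
    return {state: pred.count(state) / len(pred) for state in "hec"}


# Hypothetical toy prediction string in the same three-letter alphabet.
toy = "cccchhhhhhcceeee"
print(segments(toy))
print(composition(toy))
```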
Super-secondary Structure
• Secondary structure elements combined into specific geometric arrangements known as motifs
Beta corner
Super-secondary Structure
Several programs/websites for specific domains e.g.
• PAIRCOIL and MULTICOIL - detect coiled-coil regions
– regions separating domains
• TRESPASSER - detects Leucine Zippers
– Leu-X6-Leu-X6-Leu-X6-Leu protein interaction domain
• NPS@nalysis - Helix-Turn-Helix
– Protein interaction/DNA binding
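The Leu-X6-Leu-X6-Leu-X6-Leu pattern maps directly onto a regular expression. A minimal sketch using Python's standard `re` module; it is not a description of how TRESPASSER works internally, and the function name and test sequence are hypothetical. A lookahead is used so that overlapping matches are also reported.

```python
import re

# Leu-X6-Leu-X6-Leu-X6-Leu: four leucines, each separated by any six residues.
# The pattern sits inside a lookahead so overlapping hits are not skipped.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(seq):
    """Return (1-based start position, matched 22-residue segment) pairs."""
    return [(m.start() + 1, m.group(1)) for m in ZIPPER.finditer(seq)]


# Hypothetical test sequence with one pattern starting at residue 4.
demo = "AAA" + "LAAAAAA" * 3 + "L" + "GG"
hits = find_leucine_zippers(demo)
print(hits)
```

A pattern hit alone does not show that the region forms a helix, so in practice hits are usually checked against a coiled-coil or secondary structure prediction.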
Integrated structure prediction
• One stop shop!
• PredictProtein at EBI
– secondary structure
– solvent accessibility
– globular regions
– transmembrane helices
– coiled-coil regions
– a multiple sequence alignment
– ProSite sequence motifs
– low-complexity regions
– ProDom domain assignments
Protein sequence (primary structure)
→ Database searching for homologues
→ Homologue of known structure → Comparative modelling → 3D-structure
→ No homologue of known structure → Fold prediction, ab initio methods etc. → 3D-structure
Homology Modelling
• Method of choice following BLAST search
• SWISS-MODEL is a good WWW interface
URL: http://www.expasy.ch/swissmod/SWISS-MODEL.html
• Requires at least one sequence of known 3D-structure with significant similarity to the target sequence.
• Compare the target sequence with database - FastA and BLAST.
• Sequences with a FastA score 10.0 standard deviations above the mean of the random scores, or a P(N) lower than 10⁻⁵ (BLAST), are considered for the model building
• Restrict to those which share at least 30% residue identity
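The selection criteria above amount to a simple filter. A minimal sketch under stated assumptions: percent identity is computed over aligned columns where neither sequence has a gap (other definitions exist), and the function and parameter names are hypothetical rather than part of SWISS-MODEL.

```python
def percent_identity(aln_target, aln_template):
    """Percent identical residues over columns without a gap ('-')."""
    if len(aln_target) != len(aln_template):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a, b)
        for a, b in zip(aln_target, aln_template)
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def usable_template(identity, blast_p=None, fasta_sd=None):
    """Apply the thresholds quoted above to one candidate template.

    blast_p  : BLAST P(N) value, must be below 1e-5
    fasta_sd : FastA score in standard deviations above the random mean,
               must be at least 10.0
    identity : percent residue identity, must be at least 30
    """
    significant = (blast_p is not None and blast_p < 1e-5) or (
        fasta_sd is not None and fasta_sd >= 10.0
    )
    return significant and identity >= 30.0
```

For example, `usable_template(percent_identity("AAAA", "AAGG"), blast_p=1e-8)` accepts the template (50% identity, significant hit), while the same alignment with `blast_p=1e-3` is rejected.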
Homology Modelling
• Framework construction
– compare atom positions - Cαs
• Build non-conserved loops
• Complete backbone - add other atoms
• Add side chains
• Refine
What if I have no homologue?
Ab initio methods - Threading
• Sequence of unknown structure
• Thread it through a sequence of known structure
• Move the query sequence through residue by residue and compare computationally
– include thermodynamic criteria, solvent accessibility, secondary structure information
• Computing intensive
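The "move through residue by residue and compare" idea can be shown with a toy example. This is a deliberately crude sketch, not a real threading program: it slides the query along the template without gaps and uses a single criterion (Kyte-Doolittle hydrophobicity versus whether a template position is buried) in place of the thermodynamic, solvent accessibility and secondary structure terms listed above. The burial pattern, scoring rule and function names are all hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def thread_score(query, burial, offset):
    """Score the query placed on the template starting at `offset`.

    Toy rule: hydrophobic residues are rewarded at buried positions
    and penalised at exposed ones (and the reverse for polar residues).
    """
    total = 0.0
    for i, aa in enumerate(query):
        hydropathy = KD[aa]
        total += hydropathy if burial[offset + i] else -hydropathy
    return total


def best_threading(query, burial):
    """Try every gapless placement; return (best offset, best score)."""
    if len(query) > len(burial):
        raise ValueError("query is longer than the template")
    offsets = range(len(burial) - len(query) + 1)
    score, offset = max((thread_score(query, burial, o), o) for o in offsets)
    return offset, score


# Hypothetical template: True marks a buried position.
template_burial = [False, False, True, True, True, False]
print(best_threading("LLL", template_burial))
```

Even this toy version evaluates every placement of the query; with gaps allowed and many templates to test, the number of comparisons grows quickly, which is why the method is computing intensive.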