
Protein multiple sequence alignment by hybrid bio-inspired algorithms

Vincenzo Cutello, Giuseppe Nicosia*, Mario Pavone and Igor Prizzi

Nucleic Acids Research, 2011

D00922025 黃任鋒, R00922102 張庭耀, R00922156 陳子筠, R99922158 蘇宏麒


Outline

•Introduction & background

•IMSA

•Cloning and hypermutation operators

•Results

•Conclusion


Introduction and Background

D00922025 黃任鋒


About this paper


Problem of MSA


Methods for MSA


Progressive alignments


Exact algorithms


Survey of MSA


Immunological Multiple Sequence Alignment (IMSA)

R00922102 張庭耀


IMSA

• Two different strategies to create the initial population

• New hypermutation operators

- insert or remove gaps in the sequences to solve protein MSA

• Gap columns that have been matched are moved to the end of the sequence

• The remaining elements (i.e. amino acids) and existing gaps are shifted into the freed space
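The gap handling described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `insert_gaps`, `remove_gaps` and `move_gap_columns_to_end` are hypothetical, an alignment is assumed to be a list of equal-length strings with `-` as the gap symbol, and "matched gap columns" is interpreted here as columns that contain only gaps.

```python
GAP = "-"  # assumed gap symbol


def insert_gaps(alignment, row, pos, k):
    """Insert k gaps at column pos of one row; pad the other rows on the right
    so that the alignment stays rectangular."""
    out = []
    for i, seq in enumerate(alignment):
        if i == row:
            out.append(seq[:pos] + GAP * k + seq[pos:])
        else:
            out.append(seq + GAP * k)
    return out


def remove_gaps(alignment, row, pos, k):
    """Remove up to k consecutive gaps starting at column pos of one row.
    The rest of the row shifts left and the freed space is filled with
    trailing gaps, so the row length does not change."""
    seq = alignment[row]
    end = pos
    while end < len(seq) and end - pos < k and seq[end] == GAP:
        end += 1
    removed = end - pos
    new_seq = seq[:pos] + seq[end:] + GAP * removed
    return [new_seq if i == row else s for i, s in enumerate(alignment)]


def move_gap_columns_to_end(alignment):
    """Move columns made only of gaps to the end; all other columns
    (residues and remaining gaps) shift left into the freed space."""
    if not alignment:
        return []
    width = len(alignment[0])
    keep = [c for c in range(width) if any(s[c] != GAP for s in alignment)]
    pad = GAP * (width - len(keep))
    return ["".join(s[c] for c in keep) + pad for s in alignment]


aln = insert_gaps(["ACD", "AD-"], row=1, pos=1, k=1)
print(aln)
print(move_gap_columns_to_end(["A--C", "AG-C"]))
```

Keeping every row at the same length after each operation means a scoring function can always treat the alignment as a rectangular matrix.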


• Considers antigens (Ags) and B cells

- The Ag is a given MSA instance, i.e. the protein sequences to align

- B cells are a population of alignments that have solved (or approximated) the initial problem


Initial population strategies


Random_initialization


CLUSTALW-seeding


IMSA - Cloning and hypermutation operators

R00922156 陳子筠


Cloning and hypermutation operators


•Implemented as a static cloning operator

•Clones each B cell dup times

•Produces an intermediate population P(clo) of Nc = d × dup B cells, where d is the population size
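A minimal sketch of static cloning, under the assumption that a B cell can be modelled as a list of aligned strings. The function name `static_cloning` is hypothetical; only the relation Nc = d × dup comes from the slide.

```python
import copy


def static_cloning(population, dup):
    """Return P(clo): every B cell in the population copied dup times.
    Deep copies are used so that later hypermutation of one clone does not
    modify its parent or its sibling clones."""
    return [copy.deepcopy(cell) for cell in population for _ in range(dup)]


# Toy population of d = 2 alignments (illustrative data only).
population = [["AC-D", "A-GD"], ["ACD-", "AGD-"]]
clones = static_cloning(population, dup=3)
print(len(clones))  # Nc = d * dup
```

Because the number of copies does not depend on the fitness of the B cell, the size of P(clo) is fixed in advance by d and dup.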


InsGap


InsGap P(gap)


RemGap P(gap)


BlockShuffling operator

•Randomly selects a start point in a sequence

•BlockMove

•BlockSplitHor

•BlockSplitVer


BlockMove P(block)


BlockSplitHor P(block)


BlockSplitVer P(block)


STRIP_GAPS


Aging operator


•Eliminates old B cells from the populations P(t), P(gap) and P(block)

•τB is the maximum number of generations a B cell is allowed to remain in the population

•The new population P(t+1) of d B cells is formed from the best survivors by (μ+λ)-selection
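The aging step and the (μ+λ)-selection can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the `BCell` class and the function names are hypothetical, each B cell is assumed to carry an `age` counter that is incremented elsewhere once per generation, and a higher `score` is assumed to mean a better alignment.

```python
from dataclasses import dataclass


@dataclass
class BCell:
    alignment: tuple   # aligned sequences, one string per protein
    score: float       # objective value, higher is assumed better
    age: int = 0       # generations spent in the population


def aging(cells, tau_b):
    """Drop every B cell that is older than tau_b generations."""
    return [c for c in cells if c.age <= tau_b]


def next_population(p_t, p_gap, p_block, tau_b, d):
    """Apply aging to P(t), P(gap) and P(block), then keep the d best
    survivors from the merged pool, i.e. (mu+lambda)-selection."""
    survivors = aging(p_t + p_gap + p_block, tau_b)
    return sorted(survivors, key=lambda c: c.score, reverse=True)[:d]


p_t = [BCell(("A-C", "AGC"), score=1.0, age=0),
       BCell(("AC-", "AGC"), score=5.0, age=4)]   # too old for tau_b = 3
p_gap = [BCell(("-AC", "AGC"), score=3.0, age=1)]
p_block = [BCell(("A-C", "AGC"), score=4.0, age=0)]
print([c.score for c in next_population(p_t, p_gap, p_block, tau_b=3, d=2)])
```

Note that in this sketch the best-scoring cell is discarded because of its age, which is the mechanism aging uses to keep the population from stagnating around one solution.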


Results

R99922158 蘇宏麒


Classical Benchmark

•BAliBASE version 1.0, 2.0 and 3.0

- A benchmark alignment database

- Used to evaluate multiple sequence alignment programs


BAliBASE version 1.0

•141 reference alignments

•5 reference sets


BAliBASE version 1.0, cont.

• Reference 1: equidistant sequences with various levels of conservation

• Reference 2: family aligned with a highly divergent “orphan” sequence

• Reference 3: subgroups with < 25% residue identity between groups

• Reference 4: sequences with N/C-terminal extensions

• Reference 5: sequences with internal insertions


BAliBASE version 2.0

•Includes all alignments from version 1.0

•Alignments are verified and corrected


BAliBASE version 3.0

•Same as version 2.0

•Contains 218 alignments


IMSA - Reference 1 1ad2

IMSA - Reference 1 1aym3

IMSA - Reference 1 1hfh

IMSA - Reference 1 2mhr

IMSA - Reference 3 1uky

IMSA - Reference 5 1qpg


IMSA - BAliBASE 1.0


IMSA vs CLUSTALW-seeding - BAliBASE 1.0


IMSA - BAliBASE 2.0


IMSA vs CLUSTALW-seeding - BAliBASE 2.0


IMSA vs AIS - BAliBASE 2.0


IMSA vs ClonAlign - BAliBASE 2.0


IMSA vs COBALT, PROBCONS, PCMA, MUSCLE, CLUSTALW - BAliBASE 3.0


IMSA - BAliBASE 3.0 - SP


IMSA - BAliBASE 3.0 - CS


Running time - BAliBASE 3.0


Final Remarks

•Clonal Selection Algorithm

- IMSA

•IMSA

- CLUSTALW-seeding

- Two specific ad-hoc mutation operators

- Generates more than a single suboptimal alignment for every MSA instance


Final Remarks, cont.

•BAliBASE 1.0

•IMSA is superior to PRRP, CLUSTALX, SAGA, DIALIGN, PIMA, MULTIALIGN and PILEUP 8.


Final Remarks, cont.

•BAliBASE 2.0

•high SP, low CS

•future work - improvement of the CS score
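To make the SP/CS distinction concrete, here is a simplified sketch of the two measures. It is an assumption-laden illustration rather than the official BAliBASE scoring program: SP is taken as the fraction of residue pairs aligned in the reference that are also aligned in the test alignment, and CS as the fraction of reference columns reproduced exactly in the test alignment. All function names are hypothetical.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol


def columns(alignment):
    """Describe each column as a tuple of residue indices (None for a gap).
    Columns consisting only of gaps are skipped."""
    counters = [0] * len(alignment)
    cols = []
    for c in range(len(alignment[0])):
        col = []
        for r, row in enumerate(alignment):
            if row[c] == GAP:
                col.append(None)
            else:
                col.append(counters[r])
                counters[r] += 1
        if any(i is not None for i in col):
            cols.append(tuple(col))
    return cols


def aligned_pairs(alignment):
    """Set of residue pairs ((seq_a, idx_a), (seq_b, idx_b)) sharing a column."""
    pairs = set()
    for col in columns(alignment):
        present = [(r, i) for r, i in enumerate(col) if i is not None]
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs that the test alignment recovers."""
    ref = aligned_pairs(reference)
    return len(aligned_pairs(test) & ref) / len(ref) if ref else 0.0


def column_score(test, reference):
    """Fraction of reference columns reproduced exactly in the test alignment."""
    ref_cols = columns(reference)
    test_cols = set(columns(test))
    if not ref_cols:
        return 0.0
    return sum(col in test_cols for col in ref_cols) / len(ref_cols)


reference = ["ACD", "ACD"]
test = ["ACD-", "AC-D"]
print(sp_score(test, reference), column_score(test, reference))
```

With more than two sequences, a single misplaced residue makes the whole column count as wrong for CS while most of its residue pairs still count for SP, which is one way an alignment can combine a high SP with a low CS.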


Final Remarks, cont.

•BAliBASE 2.0

•IMSA shows better performance, and hence better alignments, than both ClonAlign and AIS.


Final Remarks, cont.

•BAliBASE 3.0 - new testbed

•Compared with state-of-the-art alignment algorithms, IMSA also produces good alignments.


Reference

•http://bips.u-strasbg.fr/fr/Products/Databases/BAliBASE2/


Thank you.