DNA+Pro: Alignment of Combined DNA-Protein Sequences for evolutionary analysis of virus genes and genomes

• Xiaolong Wang
• Ocean University of China
• Email: [email protected]
• Website: www.DNAPlusPro.com

Qingdao Eye Hospital, 2 December 2010

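• On 12 February 2001, scientists from six countries (the United States, the United Kingdom, Japan, France, Germany, and China), together with Celera, jointly published the human genome map and the results of its initial analysis.

• NATURE | VOL 409 | 15 FEBRUARY 2001
• SCIENCE | VOL 291 | 16 FEBRUARY 2001

Covers of the human genome special issues of Nature and Science, 15 and 16 February 2001. The five adults on the Science cover are the donors of the genetic material for Celera's human genome sequencing project.

In 2005, Margulies et al. published in Nature a fast, simple sequencing method that combines an emulsion system for DNA amplification with picolitre-scale, pyrophosphate-based sequencing (pyrosequencing). At the end of the same year, the new technology was turned into a commercial instrument, the 454 Genome Sequencer system, which opened the era of rapid genome sequencing.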

Species with completed genome sequences (partial list)

• Arabidopsis
• Escherichia coli
• Buchnera sp. APS
• Rickettsia prowazekii
• Ureaplasma urealyticum
• Bacillus subtilis
• Drosophila melanogaster
• Thermoplasma acidophilum
• Plasmodium falciparum
• Mouse
• Rat
• Caenorhabditis elegans
• Borrelia burgdorferi
• Aquifex aeolicus
• Neisseria meningitidis Z2491
• Mycobacterium tuberculosis
• Thermotoga maritima
• Helicobacter pylori
• Human

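The need for bioinformatics

• With genome research, the related information has grown explosively, and massive amounts of biological data urgently need to be processed.
  – Growth of the literature
  – Growth of biological data
• Genome research depends on bioinformatics.

Bioinformaticians face mountains of DNA fragments.

About 6 million years ago, humans and chimpanzees, descended from a common ancestor, set out on different evolutionary paths. Today, 6 million years later, scientists are taking a different route: by analysing the genome sequence of the chimpanzee, humanity's relative, and comparing it with the human genome sequence, they seek to answer questions about human origins and evolution.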

From the Cell to Protein Machines

• Protein inhibitors (viruses as an example)
  – attachment, entry and fusion inhibitors
  – DNA polymerase inhibitors
  – integrase inhibitors
  – interferons
  – maturation inhibitors
  – monoclonal antibodies
  – neuraminidase inhibitors
  – NS3 protease inhibitors
  – nucleoside reverse transcriptase inhibitors
  – protease inhibitors
  – reverse transcriptase inhibitors
  – RNA polymerase inhibitors

• Nucleic acid inhibitors (antisense oligonucleotides or RNAi)
  – Targeting mRNA
  – Targeting microRNA
  – Targeting genomic DNA
  – Interfering with mRNA processing
  – Aptamers: oligonucleotide or peptide molecules that bind to a specific target molecule

From Finding Homologs to drug design

Use ClustalW or ClustalX to do a progressive multiple sequence alignment (MSA)

http://www2.ebi.ac.uk/clustalw/

http://www.clustal.org/
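Progressive MSA programs build the multiple alignment up from pairwise comparisons. As a rough illustration of that pairwise step only, the sketch below does global alignment of two sequences by dynamic programming (Needleman-Wunsch). The function name and the match, mismatch and gap scores are assumptions chosen for the example; they are not ClustalW's parameters, and this is not ClustalW's implementation.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (aligned_a, aligned_b, score).

    Illustrative scoring only: fixed match/mismatch scores and a linear gap cost.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    print(needleman_wunsch("IDNSTY", "IDNTY"))
```

A progressive aligner repeats this kind of step along a guide tree, aligning sequences and then groups of already aligned sequences; that part is not shown here.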

Multiple sequence alignment programs: ClustalW, PRALINE, MUSCLE, ProbCons, T-Coffee, MAFFT

DNA+Pro: Alignment of Combined DNA-Protein Sequences for evolutionary analysis of virus genes and genomes

1A

[Figure 1A tree. Taxa: HV1J3, HV1OY, HV1B1, HV1C4, HV1A2, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2BE, HV2D1, HV2G1, HV2NZ, HV2CA, HV2D2. Node support values are shown on the figure.]

Alignment columns 218 to 238, shown as protein, as codons, and as combined DNA+Pro tokens (each codon followed by its amino acid):

<HV1J3> I N N S T K D N I K N - - - - D N S T R Y
<HV1B1> I D N - - - - - - - - - - - - - D T T S Y
<HV1C4> I D D N K N T - - - - - - - - T N N T K Y
<HV1A2> I D N A S T T - - - - - - - - T N Y T N Y
<HV1OY> I D - - - - - - - - - - - - - K N D T K F
<HV1RH> I E K G N I S P K N N T S N N T S Y G N Y
<HV1ND> I D N N N - - - - - - - - - R T N S T N Y
<HV1EL> I D N D S - - - - - - - - - S T N S T N Y
<HV1Z84> I D D D N S A N T S - - - - N T N Y T N Y
<HV1MA> I D D S D - - - - - - - - - - - - N S S Y
<HV1ZH> I G G N S S N - - - - - - - - G D S S K Y
<SIVCZ> L G N E N - - - - - - - - - - - - - N T Y

<HV1J3> ata aat aat agt acc aag gat aat ata aaa aat --- --- --- --- gat aat agt acc aga tat
<HV1B1> ata gat aat --- --- --- --- --- --- --- --- --- --- --- --- --- gat act acc agc tat
<HV1C4> ata gat gat aat aaa aat act --- --- --- --- --- --- --- --- acc aac aac acc aaa tat
<HV1A2> ata gat aat gct agt act act --- --- --- --- --- --- --- --- acc aac tat acc aac tat
<HV1OY> ata gat --- --- --- --- --- --- --- --- --- --- --- --- --- aag aat gat act aaa ttt
<HV1RH> ata gag aag ggt aat att agc cct aag aat aat act agc aat aat act agc tat ggt aac tat
<HV1ND> ata gac aat aat aat --- --- --- --- --- --- --- --- --- agg acc aat agt act aat tat
<HV1EL> ata gac aat gat agt --- --- --- --- --- --- --- --- --- agt acc aat agt acc aat tat
<HV1Z84> ata gat gat gat aat agt gct aat acc agt --- --- --- --- aat acc aat tat acc aat tat
<HV1MA> ata gat gat agt gat --- --- --- --- --- --- --- --- --- --- --- --- aat agt agt tat
<HV1ZH> att ggg gga aat agt agt aat --- --- --- --- --- --- --- --- ggt gat agt agt aaa tat
<SIVCZ> cta ggg aat gag aac --- --- --- --- --- --- --- --- --- --- --- --- --- aac aca tat

<HV1J3> ataI aatN aatN agtS accT aagK gatD aatN ataI aaaK aatN ---- ---- ---- ---- gatD aatN agtS accT agaR tatY
<HV1B1> ataI gatD aatN ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- gatD actT accT agcS tatY
<HV1C4> ataI gatD gatD aatN aaaK aatN actT ---- ---- ---- ---- ---- ---- ---- ---- accT aacN aacN accT aaaK tatY
<HV1A2> ataI gatD aatN gctA agtS actT actT ---- ---- ---- ---- ---- ---- ---- ---- accT aacN tatY accT aacN tatY
<HV1OY> ataI gatD ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- aagK aatN gatD actT aaaK tttF
<HV1RH> ataI gagE aagK ggtG aatN attI agcS cctP aagK aatN aatN actT agcS aatN aatN actT agcS tatY ggtG aacN tatY
<HV1ND> ataI gacD aatN aatN aatN ---- ---- ---- ---- ---- ---- ---- ---- ---- aggR accT aatN agtS actT aatN tatY
<HV1EL> ataI gacD aatN gatD agtS ---- ---- ---- ---- ---- ---- ---- ---- ---- agtS accT aatN agtS accT aatN tatY
<HV1Z84> ataI gatD gatD gatD aatN agtS gctA aatN accT agtS ---- ---- ---- ---- aatN accT aatN tatY accT aatN tatY
<HV1MA> ataI gatD gatD agtS gatD ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- aatN agtS agtS tatY
<HV1ZH> attI gggG ggaG aatN agtS agtS aatN ---- ---- ---- ---- ---- ---- ---- ---- ggtG gatD agtS agtS aaaK tatY
<SIVCZ> ctaL gggG aatN gagE aacN ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- aacN acaT tatY
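The combined rows in panel 1A attach to each codon the one-letter code of the amino acid it encodes (for example "ataI", "gatD", "tatY"), with "----" for a gap. A minimal sketch of how such tokens could be produced from a gapped codon row is given below. The names BASES, AMINO, CODON_TABLE and combine are hypothetical, the standard genetic code is assumed, and this is not taken from the DNA+Pro program itself.

```python
from itertools import product

# Standard genetic code, codons ordered t, c, a, g at each position
# (assumption: the standard code applies to the sequences shown).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def combine(codon_row):
    """Turn 'ata gat --- tat' into 'ataI gatD ---- tatY'.

    Each codon gets its amino acid appended; a gap codon '---' becomes '----'.
    """
    tokens = []
    for codon in codon_row.split():
        if codon == "---":
            tokens.append("----")
        else:
            tokens.append(codon + CODON_TABLE[codon.lower()])
    return " ".join(tokens)


if __name__ == "__main__":
    # First five columns of the HV1MA codon row, followed by a gap.
    print(combine("ata gat gat agt gat ---"))
```

Because each token carries both the nucleotides and the residue, a single aligned column can be read at either level.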

1B

[Figure 1B tree. Taxa: HV1J3, HV1B1, HV1C4, HV1A2, HV1OY, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2CA, HV2NZ, HV2D1, HV2G1, HV2BE, HV2D2. Node support values are shown on the figure.]

Alignment columns 256 to 280, shown as combined DNA+Pro tokens, as protein, and as codons:

HV1J3 ataI aatN aatN agtS accT aagK gatD aatN ataI aaaK aatN gatD aatN agtS accT ---- ---- ---- ---- ---- agaR ---- ---- ---- tatY
HV1B1 ataI ---- ---- ---- ---- ---- gatD aatN ---- ---- gatD ---- ---- actT accT agcS tatY ---- ---- acgT ---- ---- ---- ---- ----
HV1C4 ataI ---- ---- ---- ---- ---- gatD gatD ---- ---- aatN aaaK aatN actT accT aacN aacN ---- ---- accT aaaK ---- ---- ---- tatY
HV1A2 ataI ---- ---- ---- ---- ---- gatD aatN ---- ---- gctA agtS actT actT accT aacN tatY ---- ---- accT aacN ---- ---- ---- tatY
HV1OY ataI ---- ---- ---- ---- ---- gatD aagK ---- ---- aatN gatD ---- actT ---- ---- ---- ---- ---- ---- aaaK ---- ---- ---- tttF
HV1RH ataI gagE ---- ---- aagK ggtG aatN attI agcS cctP aagK aatN aatN actT agcS aatN aatN ---- ---- actT agcS tatY ggtG aacN tatY
HV1ND ataI ---- ---- ---- ---- ---- gacD aatN ---- ---- ---- aatN aatN aggR ---- ---- ---- ---- ---- accT aatN agtS actT aatN tatY
HV1EL ataI ---- ---- ---- ---- ---- gacD aatN ---- ---- ---- gatD agtS agtS ---- ---- ---- ---- ---- accT aatN agtS accT aatN tatY
HV1Z84 ataI ---- ---- ---- ---- ---- gatD gatD ---- ---- ---- gatD aatN agtS gctA aatN accT agtS aatN accT aatN tatY accT aatN tatY
HV1MA ataI ---- ---- ---- ---- ---- gatD gatD agtS ---- ---- gatD aatN agtS ---- ---- ---- ---- ---- ---- ---- ---- ---- agtS tatY
HV1ZH attI gggG ggaG aatN ---- ---- ---- agtS agtS aatN ggtG gatD agtS agtS aaaK ---- ---- ---- ---- ---- ---- ---- ---- ---- tatY
SIVCZ ctaL gggG aatN ---- ---- ---- ---- gagE aacN ---- ---- ---- aacN acaT ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- tatY

HV1J3 I N N S T K D N I K N D N S T - - - - - R - - - Y
HV1B1 I - - - - - D N - - D - - T T S Y - - T - - - - -
HV1C4 I - - - - - D D - - N K N T T N N - - T K - - - Y
HV1A2 I - - - - - D N - - A S T T T N Y - - T N - - - Y
HV1OY I - - - - - D K - - N D - T - - - - - - K - - - F
HV1RH I E - - K G N I S P K N N T S N N - - T S Y G N Y
HV1ND I - - - - - D N - - - N N R - - - - - T N S T N Y
HV1EL I - - - - - D N - - - D S S - - - - - T N S T N Y
HV1Z84 I - - - - - D D - - - D N S A N T S N T N Y T N Y
HV1MA I - - - - - D D S - - D N S - - - - - - - - - S Y
HV1ZH I G G N - - - S S N G D S S K - - - - - - - - - Y
SIVCZ L G N - - - - E N - - - N T - - - - - - - - - - Y

HV1J3 ata aat aat agt acc aag gat aat ata aaa aat gat aat agt acc --- --- --- --- --- aga --- --- --- tat
HV1B1 ata --- --- --- --- --- gat aat --- --- gat --- --- act acc agc tat --- --- acg --- --- --- --- ---
HV1C4 ata --- --- --- --- --- gat gat --- --- aat aaa aat act acc aac aac --- --- acc aaa --- --- --- tat
HV1A2 ata --- --- --- --- --- gat aat --- --- gct agt act act acc aac tat --- --- acc aac --- --- --- tat
HV1OY ata --- --- --- --- --- gat aag --- --- aat gat --- act --- --- --- --- --- --- aaa --- --- --- ttt
HV1RH ata gag --- --- aag ggt aat att agc cct aag aat aat act agc aat aat --- --- act agc tat ggt aac tat
HV1ND ata --- --- --- --- --- gac aat --- --- --- aat aat agg --- --- --- --- --- acc aat agt act aat tat
HV1EL ata --- --- --- --- --- gac aat --- --- --- gat agt agt --- --- --- --- --- acc aat agt acc aat tat
HV1Z84 ata --- --- --- --- --- gat gat --- --- --- gat aat agt gct aat acc agt aat acc aat tat acc aat tat
HV1MA ata --- --- --- --- --- gat gat agt --- --- gat aat agt --- --- --- --- --- --- --- --- --- agt tat
HV1ZH att ggg gga aat --- --- --- agt agt aat ggt gat agt agt aaa --- --- --- --- --- --- --- --- --- tat
SIVCZ cta ggg aat --- --- --- --- gag aac --- --- --- aac aca --- --- --- --- --- --- --- --- --- --- tat
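Panel 1B shows the same aligned columns three ways: as combined tokens, as protein, and as codons. The sketch below illustrates how a combined row could be projected back into those two views. The function name split_combined is hypothetical, and the four-character token layout (three nucleotides plus one residue, "----" for a gap) is inferred from the rows shown, not from the DNA+Pro source.

```python
def split_combined(row):
    """Split 'ataI ---- gatD' into ('ata --- gat', 'I - D').

    Every token must be exactly four characters: a codon plus its residue,
    or '----' for a gap column.
    """
    codons, residues = [], []
    for token in row.split():
        if len(token) != 4:
            raise ValueError(f"expected a 4-character token, got {token!r}")
        if token == "----":
            codons.append("---")
            residues.append("-")
        else:
            codons.append(token[:3])
            residues.append(token[3])
    return " ".join(codons), " ".join(residues)


if __name__ == "__main__":
    codon_view, protein_view = split_combined("ataI ---- gatD aatN")
    print(codon_view)
    print(protein_view)
```

Since both views come from the same tokens, the protein and codon alignments stay column-for-column consistent. Tokens that are not exactly four characters are rejected instead of being split by guesswork.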

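Where did HIV come from?

Freeman & Herron, 2001. Evolutionary Analysis. Prentice Hall

Science, 13 June 2003

Two viruses from different monkey species recombined in African chimpanzees to form the SIV strain that gave rise to human AIDS. SIVcpz, the chimpanzee virus, arose through successive transmission and recombination of SIVs from red-capped mangabeys and greater spot-nosed monkeys. Chimpanzees prey on both of these monkeys, and the ranges of the monkeys and the chimpanzees overlap in west-central Africa. Humans are therefore not the only species to have acquired two different SIV strains through natural cross-species transmission, which most likely results from predation. Has chimpanzee predation on small monkeys led to other SIV infections in chimpanzees? How likely are co-infection or recombination of these SIVs with SIVcpz? And are these chimpanzee-adapted SIVs ultimately more likely to infect humans?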

[Figure 2: phylogenetic trees for gag and env. Panels: 2A Protein only, 2B DNA+PRO. Taxa: HV1J3, HV1B1, HV1C4, HV1A2, HV1OY, HV1RH, HV1EL, HV1ND, HV1MA, SIVCZ, HV2CA, HV2NZ, HV2BE, HV2D1, HV2G1, SIVM1, SIVV1, SIVG1, SIVGB. Node support values are shown on the figure.]

[Figure 3: phylogenetic trees of BamHI homologs. Panels: 3A Protein only, 3B DNA+PRO. Taxa: OkrAI, HchORF2488P, Bsp98I, Csp7822ORF584P, HauORF2756P, BsuBSP1ORFAP, DdsI, BamHI. Node support values are shown on the figure.]

[Figure 4: phylogenetic trees of SAUSA300_2431 homologs. Panels: 4A Protein only, 4B DNA+PRO, 4C. Taxa: JH9, JH1, MW2, MSSA476, USA300 TCH1516, USA300 FPR3757, str NewMan, NCTC, COL, MRSA252, RP62A, TM300. Node support values are shown on the figure.]

A Robust multi-gene phylogenetic tree

S1A ClustalW

[Tree. Taxa: HV1J3, HV1OY, HV1B1, HV1C4, HV1A2, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2CA, HV2NZ, HV2D1, HV2BE, HV2G1, HV2D2. Node support values are shown on the figure.]

Alignment columns 208 to 248:

<HV1J3> A L F Y K H D V V P I N N S T K D N I K N - - - - D N S T R Y R L I S C N T S V I
<HV1OY> A L F Y K L D V L P I D - - - - - - - - - - - - - K N D T K F R L I H C N T S T I
<HV1B1> A F F Y K L D I I P I D N - - - - - - - - - - - - - D T T S Y T L T S C N T S V I
<HV1C4> A L F Y K L D V E P I D D N K N T - - - - - - - - T N N T K Y R L I N C N T S V I
<HV1A2> A L F R N L D V V P I D N A S T T - - - - - - - - T N Y T N Y R L I H C N R S V I
<HV1RH> A L F Y K L D V V P I E K G N I S P K N N T S N N T S Y G N Y T L I H C N S S V I
<HV1ND> A L F Y K L D I V P I D N N N - - - - - - - - - R T N S T N Y R L I N C D T S T I
<HV1EL> A L F Y R L D I V P I D N D S - - - - - - - - - S T N S T N Y R L I N C N T S A I
<HV1Z84> A L F Y R L D V V P I D D D N S A N T S - - - - N T N Y T N Y R L I N C N T S A I
<HV1MA> A T F Y N L D L V Q I D D S D - - - - - - - - - - - - N S S Y R L I N C N T S V I
<HV1ZH> S L F Y R L D I V P I G G N S S N - - - - - - - - G D S S K Y R L I N C N T S A I
<SIVCZ> S L F Y V E D V V N L G N E N - - - - - - - - - - - - - N T Y R I I N C N T T A I
<HV2NZ> E A W Y S K D V V C D N - - - N T S S - - - - - - - - Q S K C Y M N H C N T S V I
<HV2CA> E T W Y S S D V V C D N S T D Q T T N - - - - - - - - E T T C Y M N H C N T S V I
<HV2G1> E T W Y S K D V V C E S N N T K D G - - - - - - - - - K N R C Y M N H C N T S V I
<HV2D1> D A W Y S R D V V C D K T N G - - - - - - - - - - - - T G T C Y M R H C N T S V I
<HV2BE> D T W Y L E D V V C D N T T - - - - - - - - - - - - - A G T C Y M R H C N T S I I
<HV2D2> D T W Y S E D L E C N N T R - - - K Y - - - - - - - - T S R C Y I R T C N T T I I

S1B MAFFT

[Tree. Taxa: HV1J3, HV1B1, HV1A2, HV1RH, HV1C4, HV1OY, HV1ND, HV1EL, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2NZ, HV2CA, HV2G1, HV2D1, HV2BE, HV2D2. Node support values are shown on the figure.]

Alignment columns 204 to 244:

<HV1J3> A L F Y K H D V V P I N N S T K D N I K N D N S T - - - - R Y R L I S C N T S V I
<HV1B1> A F F Y K L D I I P I D N - - - - - - - - - D T T - - - - S Y T L T S C N T S V I
<HV1A2> A L F R N L D V V P I D N A S T T - - - - T N Y T - - - - N Y R L I H C N R S V I
<HV1RH> A L F Y K L D V V P I E K G N I S P K N N T S N N T S Y G N Y T L I H C N S S V I
<HV1C4> A L F Y K L D V E P I D D N K N T - - - - T N N T - - - - K Y R L I N C N T S V I
<HV1OY> A L F Y K L D V L P I D - - - - - - - - - K N D T - - - - K F R L I H C N T S T I
<HV1ND> A L F Y K L D I V P I D N N N R - - - - - T N S T - - - - N Y R L I N C D T S T I
<HV1EL> A L F Y R L D I V P I D N D S S - - - - - T N S T - - - - N Y R L I N C N T S A I
<HV1Z84> A L F Y R L D V V P I D D D N S A N T S N T N Y T - - - - N Y R L I N C N T S A I
<HV1MA> A T F Y N L D L V Q I D D - - - - - - - - S D N S - - - - S Y R L I N C N T S V I
<HV1ZH> S L F Y R L D I V P I G G N S S N - - - - G D S S - - - - K Y R L I N C N T S A I
<SIVCZ> S L F Y V E D V V N L G N E N N - - - - - - - - - - - - - T Y R I I N C N T T A I
<HV2NZ> E A W Y S K D V V C D N N T - - - - - - - S S Q S - - - - K C Y M N H C N T S V I
<HV2CA> E T W Y S S D V V C D N S T D Q T - - - - T N E T - - - - T C Y M N H C N T S V I
<HV2G1> E T W Y S K D V V C E S N N T K - - - - - D G K N - - - - R C Y M N H C N T S V I
<HV2D1> D A W Y S R D V V C D K T N - - - - - - - - G T G - - - - T C Y M R H C N T S V I
<HV2BE> D T W Y L E D V V C D N T T - - - - - - - - - A G - - - - T C Y M R H C N T S I I
<HV2D2> D T W Y S E D L E C N N T R - - - - - - - K Y T S - - - - R C Y I R T C N T T I I

S1C MUSCLE

[Tree. Taxa: HV1J3, HV1B1, HV1C4, HV1A2, HV1OY, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2CA, HV2NZ, HV2D1, HV2G1, HV2BE, HV2D2. Node support values are shown on the figure.]

Alignment columns 234 to 274:

<HV1J3> A L F Y K H D V V P I N N S T - - - - K D N I K N D N S T R Y R L I S C N T S V I
<HV1B1> A F F Y K L D I I P I D - - - - - - - - - - - - - N D T T S Y T L T S C N T S V I
<HV1C4> A L F Y K L D V E P I D - - - - - - - - D N K N T T N N T K Y R L I N C N T S V I
<HV1A2> A L F R N L D V V P I D - - - - - - - - N A S T T T N Y T N Y R L I H C N R S V I
<HV1OY> A L F Y K L D V L P I D - - - - - - - - - - - - - K N D T K F R L I H C N T S T I
<HV1RH> A L F Y K L D V V P I E K G N I S P K N N T S N N T S Y G N Y T L I H C N S S V I
<HV1ND> A L F Y K L D I V P I D - - - - - - - - - N N N R T N S T N Y R L I N C D T S T I
<HV1EL> A L F Y R L D I V P I D - - - - - - - - - N D S S T N S T N Y R L I N C N T S A I
<HV1Z84> A L F Y R L D V V P I D D D N - - - - S A N T S N T N Y T N Y R L I N C N T S A I
<HV1MA> A T F Y N L D L V Q I D - - - - - - - - - - - - D S D N S S Y R L I N C N T S V I
<HV1ZH> S L F Y R L D I V P I G - - - - - - - - G N S S N G D S S K Y R L I N C N T S A I
<SIVCZ> S L F Y V E D V V N L G - - - - - - - - - - - - - N E N N T Y R I I N C N T T A I
<HV2NZ> E A W Y S K D V V C D N - - - - - - - - - - - N T S S Q S K C Y M N H C N T S V I
<HV2CA> E T W Y S S D V V C D N - - - - - - - - S T D Q T T N E T T C Y M N H C N T S V I
<HV2G1> E T W Y S K D V V C E S - - - - - - - - - N N T K D G K N R C Y M N H C N T S V I
<HV2D1> D A W Y S R D V V C D K - - - - - - - - - - - - T N G T G T C Y M R H C N T S V I
<HV2BE> D T W Y L E D V V C D N - - - - - - - - - - - - - T T A G T C Y M R H C N T S I I
<HV2D2> D T W Y S E D L E C N N - - - - - - - - - - - T R K Y T S R C Y I R T C N T T I I

S1D T-coffee

[Tree. Taxa: HV1J3, HV1B1, HV1C4, HV1A2, HV1OY, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2CA, HV2NZ, HV2D1, HV2G1, HV2BE, HV2D2. Node support values are shown on the figure.]

Alignment columns 226 to 266:

<HV1J3> A L F Y K H D V V P I N N S T K D N I K N D N S T - - - - R Y R L I S C N T S V I
<HV1B1> A F F Y K L D I I P I D N - - - - - - - - - D T T - - - - S Y T L T S C N T S V I
<HV1C4> A L F Y K L D V E P I D D N K N T - - - - T N N T - - - - K Y R L I N C N T S V I
<HV1A2> A L F R N L D V V P I D N A S T T - - - - T N Y T - - - - N Y R L I H C N R S V I
<HV1OY> A L F Y K L D V L P I D - - - - - - - - - K N D T - - - - K F R L I H C N T S T I
<HV1RH> A L F Y K L D V V P I E K G N I S P K N N T S N N T S Y G N Y T L I H C N S S V I
<HV1ND> A L F Y K L D I V P I D N N N R - - - - - T N S T - - - - N Y R L I N C D T S T I
<HV1EL> A L F Y R L D I V P I D N D S S - - - - - T N S T - - - - N Y R L I N C N T S A I
<HV1Z84> A L F Y R L D V V P I D D D N S A N T S N T N Y T - - - - N Y R L I N C N T S A I
<HV1MA> A T F Y N L D L V Q I D D - - - - - - - - S D N S - - - - S Y R L I N C N T S V I
<HV1ZH> S L F Y R L D I V P I G G N S S N - - - - G D S S - - - - K Y R L I N C N T S A I
<SIVCZ> S L F Y V E D V V N L G N E N N - - - - - - - - - - - - - T Y R I I N C N T T A I
<HV2NZ> E A W Y S K D V V C D N N T - - - - - - - S S Q S - - - - K C Y M N H C N T S V I
<HV2CA> E T W Y S S D V V C D N S T D Q T - - - - T N E T - - - - T C Y M N H C N T S V I
<HV2G1> E T W Y S K D V V C E S N N T K - - - - - D G K N - - - - R C Y M N H C N T S V I
<HV2D1> D A W Y S R D V V C D K T N - - - - - - - - G T G - - - - T C Y M R H C N T S V I
<HV2BE> D T W Y L E D V V C D N T T - - - - - - - - - A G - - - - T C Y M R H C N T S I I
<HV2D2> D T W Y S E D L E C N N T R - - - - - - - K Y T S - - - - R C Y I R T C N T T I I

S1E PRANK

[Tree. Taxa: HV1J3, HV1B1, HV1C4, HV1A2, HV1OY, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2CA, HV2NZ, HV2D1, HV2G1, HV2BE, HV2D2. Node support values are shown on the figure.]

Alignment columns 287 to 350:

<HV1J3> A L F Y K H D V V P I N N S T K D N I K N D - - - - - - - - - - N S T - - - - - - - - - - - - - - - - - R Y R L I S C N T S V I
<HV1B1> A F F Y K L D I I P I - - - - - - - - - D N - - - - - - - - - - D T T - - - - - - - - - - - - - - - - - S Y T L T S C N T S V I
<HV1C4> A L F Y K L D V E P I - - - - - - - - - D D - - - - - N K N T T N N T - - - - - - - - - - - - - - - - - K Y R L I N C N T S V I
<HV1A2> A L F R N L D V V P I - - - - - - - - - D N A S T T T - - - - - N Y T - - - - - - - - - - - - - - - - - N Y R L I H C N R S V I
<HV1OY> A L F Y K L D V L P I - - - - - - - - - D K - - - - - - - - - - N D T - - - - - - - - - - - - - - - - - K F R L I H C N T S T I
<HV1RH> A L F Y K L D V V P I - - - - - - - - - E K G N I S P K N N T S N N T - - - - - - - - - - - - - - S Y G N Y T L I H C N S S V I
<HV1ND> A L F Y K L D I V P I - - - - - - - - - D N - N N R - - - - - T N S T - - - - - - - - - - - - - - - - - N Y R L I N C D T S T I
<HV1EL> A L F Y R L D I V P I - - - - - - - - - D N - D S S - - - - - T N S T - - - - - - - - - - - - - - - - - N Y R L I N C N T S A I
<HV1Z84> A L F Y R L D V V P I - - - - - - - - - D D - D N S A N T S N T N Y T - - - - - - - - - - - - - - - - - N Y R L I N C N T S A I
<HV1MA> A T F Y N L D L V Q I - - - - - - - - - D D S D N S - - - - - S - - - - - - - - - - - - - - - - - - - - - Y R L I N C N T S V I
<HV1ZH> S L F Y R L D I V P I - - - - - - - - - - - - - - G G N S S N G D S S - - - - - - - - - - - - - - - - - K Y R L I N C N T S A I
<SIVCZ> S L F Y V E D V V - - - - - - - - - - - - - - - - - - N L G N E N N T - - - - - - - - - - - - - - - - - - Y R I I N C N T T A I
<HV2NZ> E A W Y S K D V V - - - - - - - - - - - - - - - - - - - C D - - N N T - - - - - S - - - - S Q S K - - - - C Y M N H C N T S V I
<HV2CA> E T W Y S S D V V - - - - - - - - - - - - - - - - - - - C D - - N S T - D Q T - T - - - - N E T T - - - - C Y M N H C N T S V I
<HV2G1> E T W Y S K D V V - - - - - - - - - - - - - - - - - - - C E - - S N N - - - - - T K D G K N - - R - - - - C Y M N H C N T S V I
<HV2D1> D A W Y S R D V V - - - - - - - - - - - - - - - - - - - C D - - K T N G - - - - T - - - - G - - T - - - - C Y M R H C N T S V I
<HV2BE> D T W Y L E D V V - - - - - - - - - - - - - - - - - - - C D - - N T T - - - - - A - - - - G - - T - - - - C Y M R H C N T S I I
<HV2D2> D T W Y S E D L E - - - - - - - - - - - - - - - - - - - C N - - N T R - - - - - K - - - - Y T S R - - - - C Y I R T C N T T I I

S1F DNA+PRO

[Tree. Taxa: HV1J3, HV1B1, HV1C4, HV1A2, HV1OY, HV1RH, HV1EL, HV1ND, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2CA, HV2NZ, HV2D1, HV2G1, HV2BE, HV2D2. Node support values are shown on the figure.]

Alignment columns 245 to 290:

HV1J3 A L F Y K H D - V V P I N N S T K D N I K N D N S T - - - - - R - - - Y R L I S C N T S V I
HV1B1 A F F Y K L D - I I P I - - - - - D N - - D - - T T S Y - - T - - - - - - L T S C N T S V I
HV1C4 A L F Y K L D - V E P I - - - - - D D - - N K N T T N N - - T K - - - Y R L I N C N T S V I
HV1A2 A L F R N L D - V V P I - - - - - D N - - A S T T T N Y - - T N - - - Y R L I H C N R S V I
HV1OY A L F Y K L D - V L P I - - - - - D K - - N D - T - - - - - - K - - - F R L I H C N T S T I
HV1RH A L F Y K L D - V V P I E - - K G N I S P K N N T S N N - - T S Y G N Y T L I H C N S S V I
HV1ND A L F Y K L D - I V P I - - - - - D N - - - N N R - - - - - T N S T N Y R L I N C D T S T I
HV1EL A L F Y R L D - I V P I - - - - - D N - - - D S S - - - - - T N S T N Y R L I N C N T S A I
HV1Z84 A L F Y R L D - V V P I - - - - - D D - - - D N S A N T S N T N Y T N Y R L I N C N T S A I
HV1MA A T F Y N L D - L V Q I - - - - - D D S - - D N S - - - - - - - - - S Y R L I N C N T S V I
HV1ZH S L F Y R L D - I V P I G G N - - - S S N G D S S K - - - - - - - - - Y R L I N C N T S A I
SIVCZ S L F Y V - E D V V N L G N - - - - E N - - - N T - - - - - - - - - - Y R I I N C N T T A I
HV2NZ E A W Y S K D - V V - - - - - - C D - - - - N N T S S - - Q S K - - C Y M - N H C N T S V I
HV2CA E T W Y S S D - V V - - - - - - C D N S T - D Q T T N - - E T T - - C Y M - N H C N T S V I
HV2G1 E T W Y S K D - V V - - - - - - C E S - - - N N T K - - - D G K N R C Y M - N H C N T S V I
HV2D1 D A W Y S R D - V V - - - - - - C D - - - - - K T N G - T G - T - - C Y M - R H C N T S V I
HV2BE D T W Y L E D - V V - - - - - - C D - - - - - N T T - - - A G T - - C Y M - R H C N T S I I
HV2D2 D T W Y S E D - L E - - - - - - C - - - - - N N T R K - Y T S R - - C Y I - R T C N T T I I

[Figure S2: phylogenetic trees for ENV. Panels: S2A ClustalW, S2B MAFFT, S2C MUSCLE, S2D T-coffee, S2E PRANK, S2F DNA+PRO. Taxa: HV1W1, HV1J3, HV1B1, HV1BN, HV1C4, HV1RH, HV1A2, HV1OY, HV1ND, HV1EL, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2D1, HV2G1, HV2BE, HV2NZ, HV2CA, HV2D2, SIVM1, SIVG1, SIVV1, SIVGB. Node support values are shown on the figure.]

[Figure S3: phylogenetic trees for gag. Panels: S3A Protein only, S3B DNA+PRO. Taxa: HV1BN, HV1J3, HV1B1, HV1C4, HV1W1, HV1A2, HV1OY, HV1RH, HV1ND, HV1EL, HV1Z84, HV1MA, HV1ZH, SIVCZ, HV2NZ, HV2CA, HV2BE, HV2G1, HV2D1, HV2D2, SIVM1, SIVV1, SIVG1, SIVGB. Node support values are shown on the figure.]

[Figure: four phylogenetic trees, two of them labelled env. Taxa: MN, BRU, JRFL, AD8, WEAU, RF, ETH2220, 92BR025, IN21068, 96BW05, 92UG037, Q23, IBNG, DJ264, DJ263, PVMY. Node support values are shown on the figure.]

DNA+Pro: www.DNAPlusPro.com