Source: Fudan University, admis.fudan.edu.cn/ds2013/sta/day1.pdf

Introduction to Bioinformatics

Zhiping Weng, 翁志萍

Program in Bioinformatics and Integrative Biology University of Massachusetts Medical School

Department of Bioinformatics, School of Life Sciences and Technology, Tongji University

Outline

•  Personal perspective of the history of bioinformatics

•  Examples in chronological order

What is Bioinformatics?

•  Informatics was coined in the early 1990s

•  Informatics = Information Science + Technology

•  Bioinformatics = Biology + Informatics (circa 1992)

1996

Boston University

U Mass Medical School

“Bioinformatics” dates back to the 1960s

• Molecules as documents of evolutionary history. Journal of Theoretical Biology (1965) 8:357-66. Emile Zuckerkandl and Linus Pauling

• Computer analysis of protein evolution. Scientific American (1969) 221:86-95. Margaret Dayhoff

GenBank was established in 1982

Human genome was sequenced

10^15

(10^11)

In 1983 “bioinformatics” shook the world

•  Platelet-derived growth factor is structurally related to the putative transforming protein p28sis of simian sarcoma virus. Nature (1983) 304:35-9. Waterfield MD et al.

•  Simian sarcoma virus onc gene, v-sis, is derived from the gene (or genes) encoding a platelet-derived growth factor. Science (1983) 221:275-7. Doolittle RF et al.

50 years from DNA to genomes

•  1953 Watson & Crick discovered the double helix

•  1970s Recombinant DNA technology

•  1987 “Genomics” coined by T. Roderick: Gene + Chromosome + “ics” = Genomics

•  1990 Human Genome Project launched

•  1998 Human Genome Project accelerated by Celera Genomics Inc.

•  2001 Human Genome published twice

•  2003 “Finished” Human Genome sequence

•  2005, 2007, 2010 Human variation: HapMap, 1000 Genomes

•  2010 6,216 sequenced genomes (2,313 eukaryotes)

•  2013 personal genomes, epigenomes, proteomes, etc.

Rapid development of sequencing technologies

Illumina HiSeq, ABI SOLiD, 454 Sequencer

Capable of resequencing a human genome in 1 week for under $5,000

[Figure labels: capillary, automated capillary, 454, Solexa/Illumina GA, HiSeq, GA2, GA2x]

Technological improvements

Plummeting cost of DNA sequencing

Next generation sequencing applications

Gogol-Döring A, Chen W. Methods Mol Biol. 2012;802:249-57.

Outline

•  Personal perspective of the history of bioinformatics

•  Examples in chronological order

1

Two genomes enable detection of segmental duplications (SD)

Creatine transporter (SLC6A8) and adrenoleukodystrophy (ABCD1) segmental duplication

2

A more ancient genome can reveal history

• Whole genome duplication can supply raw materials for new functions.

•  Polyploidy causes genome instability.

• Whole genome duplication is accompanied by massive gene loss.

•  8% of the genes in baker’s yeast (Saccharomyces cerevisiae) are duplicated.

•  Did the yeast undergo whole genome duplication?

A distant relative, K. waltii, reveals synteny

Kellis, ..., Lander Nature (2004) 428: 617-624
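
The reasoning on this slide can be illustrated with a toy sketch. If a whole genome duplication happened after the two lineages split, a block of genes in the unduplicated relative should match two separate regions in S. cerevisiae, each region keeping only part of the original gene set after gene loss. The Python sketch below counts how many blocks show that two-to-one pattern. The block and region names are made up for illustration, and this is not the pipeline used by Kellis et al.; it only shows the counting idea.

```python
from collections import defaultdict


def regions_per_block(matches):
    """Group matched S. cerevisiae regions by K. waltii block.

    `matches` is an iterable of (kw_block, sc_region) pairs, one pair
    per syntenic match. Both identifiers are hypothetical labels.
    """
    hits = defaultdict(set)
    for kw_block, sc_region in matches:
        hits[kw_block].add(sc_region)
    return dict(hits)


def fraction_doubly_conserved(matches):
    """Fraction of K. waltii blocks that match exactly two regions."""
    hits = regions_per_block(matches)
    if not hits:
        return 0.0
    doubly = sum(1 for regions in hits.values() if len(regions) == 2)
    return doubly / len(hits)


# Toy, made-up data: two blocks map to two regions each, one maps to one.
toy_matches = [
    ("kw_block_1", "sc_region_A"),
    ("kw_block_1", "sc_region_B"),
    ("kw_block_2", "sc_region_C"),
    ("kw_block_2", "sc_region_D"),
    ("kw_block_3", "sc_region_E"),
]

print(fraction_doubly_conserved(toy_matches))
```

A high fraction of two-to-one blocks across the genome would support a whole genome duplication, while scattered local duplications would leave most blocks matching a single region.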

3

Assaying expression of all genes in a genome

Prediction of central nervous system embryonal tumor outcome based on gene expression. Pomeroy, ..., Golub, Nature (2002) 415:436-442

4

Look beyond genes in the human genome

Where are the regulatory regions?

Genome-wide detection of regulatory regions

The ENCODE (ENCyclopedia Of DNA Elements) Project. Science (2004) 306:636-640.

DNase hypersensitive sites

What are ubiquitous hypersensitive sites?

Xi, ... , Weng, Crawford, PLoS Genetics (2007) 8:8-20

5

Sequence cancer genomes

Campbell, ..., Stratton, Futreal. Nature Genetics (2008) 40:6-13

6

Genome rearrangement creates a fused gene

Campbell, ..., Stratton, Futreal. Nature Genetics (2008) 40:6-13

7

Epigenome

Bernstein, ... Lander, Cell (2007) 128:669-681

H3K4me3 in human prefrontal cortex neurons

Cheung, Shulha, ..., Weng, Akbarian, PNAS (2010)

What kinds of genes are turned on only in neurons?

•  alkali metal ion binding

•  transmembrane transporter activity

•  voltage-gated potassium channel activity

•  synaptic transmission

•  transmission of nerve impulse

•  Cell-cell communication

•  These genes can be targets for drug development
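
A list of functional categories like the one above is often produced with an over-representation test. The slides do not say which method was used, so the following Python sketch is only one common possibility: a one-sided hypergeometric test asking whether a category appears among the selected genes more often than chance would predict. All counts and parameter names here are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, category_genes, selected_genes, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    total_genes:    number of genes in the background set
    category_genes: background genes annotated with the category
    selected_genes: genes in the list of interest (e.g. neuron-specific)
    overlap:        selected genes that carry the annotation
    """
    max_overlap = min(category_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    tail = sum(
        comb(category_genes, k)
        * comb(total_genes - category_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Toy, made-up numbers: 20 background genes, 5 in the category,
# 5 selected genes, 4 of which are in the category.
p_value = hypergeom_enrichment_p(20, 5, 5, 4)
print(p_value)
```

In practice many categories are tested at once, so the resulting p-values would need a multiple-testing correction before any category is reported as enriched.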

8

Personalized medicine

Ng, ... Venter. Nature (2009) 461:724-726

A common BIM deletion polymorphism mediates intrinsic resistance and inferior responses to tyrosine kinase inhibitors in cancer

Ng et al., Nature Medicine, 2012

9

Network analysis: integrating all types of data at a system level

http://www.cell.com/issue?pii=S0092-8674%2811%29X0006-5

Networks in Cellular Systems Vidal et al., Cell, Volume 144, Issue 6, 986-998, 18 March 2011

10

Integrative inference of transcriptional networks

D. Marbach et al., Genome Res. 2012 July; 22(7): 1334–1349.

11

Evolution of gene regulatory networks

IS Peter and EH Davidson, Cell, Volume 144, Issue 6, 18 March 2011, Pages 970–985

12

Developing network models in cancer

Dana Pe'er and Nir Hacohen, Cell, Volume 144, Issue 6, 18 March 2011, Pages 864-873

Welcome to join us!

[email protected]