From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical Interactome of Yeast Shoshana Wodak Hospital for Sick Children [email protected] Depts. Biochemistry & Medical Genetics and Microbiology University of Toronto Swiss-Prot Fortaleza 2006

Page 1: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

From High Throughput Pull-Downs To Protein Complexes:
Building a Model of the Physical Interactome of Yeast

Shoshana Wodak
Hospital for Sick Children
[email protected]
Depts. Biochemistry & Medical Genetics and Microbiology
University of Toronto

Swiss-Prot Fortaleza 2006

Page 2: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Protein complexes and the physical interactome

Network of physically interacting proteins

Complexes are the cell’s factories:
- spliceosome
- proteasome
- ribosome
- replication complex
- …
- cytochrome bc1

Essential role!

Our knowledge about them is limited…
- they can be rather dynamic entities, with variable lifetimes
- their formation is likely regulated at various levels: transcriptional level, post-transcriptional modification, degradation…

Page 3: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

1st step: mapping the physical interactome

Mapping binary interactions:
- two-hybrid screens
- split-ubiquitin screens (membrane proteins)

Characterizing complexes:
- overexpression & single affinity purification + MS
- tandem affinity purification (TAP) + MS

Most extensive studies done in the yeast S. cerevisiae.

Many low-throughput studies:
- MIPS/CYGD & SGD databases, now containing ~215-230 hand-curated protein complexes for S. cerevisiae

Several HTP studies:
- Two-hybrid (2YHB): Ito et al. (2002); Uetz et al. (2002)
- AP & TAP: Ho et al. (2002); Gavin et al. (2002); Gavin et al. (2006); Krogan et al. (2006)

Page 4: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Similarities and differences between the two 2006 studies (Gavin/Krogan)

[Figure: Venn diagrams of # Baits and # Preys across the Ho, Gavin and Krogan studies. Set labels: Krogan et al. (2338), Ho et al. (573), Ho et al. (1389), Krogan et al. (5333), Gavin et al. (2671), Gavin et al. (1993). Source: Collins et al. (2006).]

Gavin complexes (491) ≠ Krogan complexes (547) ???

[Figure: Gavin/Krogan overlap between complexes, binned as overlap ≤ 5%; 5% < overlap ≤ 50%; 50% < overlap ≤ 90%; 90% < overlap.]

Page 5: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

High throughput study of Krogan et al. (2006)

Page 6: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Workflow diagram, steps (I) to (V): MALDI/MS and LC/MS; Deriving the PPI Network; Identifying Functional Modules; Validation and Analysis.]

Page 7: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

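Interaction score calculation

Representing interactions in the PPI graph:
- Spokes: each bait is linked only to the preys it pulled down
- Matrix: the bait and all preys of a purification are linked pairwise

[Figure: MS analysis of bait/prey purifications turned into a PPI graph with edge weights w_i (example weights shown: 0.05, 0.90, 0.90, 0.32, 0.54).]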
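The two representations can be sketched in a few lines of Python. This is only an illustration of the spoke and matrix expansions; the function names and the toy purification (bait "B" with preys "A", "C", "D") are hypothetical and not taken from the slides.

```python
from itertools import combinations


def spoke_edges(bait, preys):
    """Spoke model: link the bait to each prey it pulled down."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_edges(bait, preys):
    """Matrix model: link every pair among the bait and its preys."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: bait "B" pulls down preys "A", "C" and "D".
spokes = spoke_edges("B", ["A", "C", "D"])
matrix = matrix_edges("B", ["A", "C", "D"])
print(len(spokes), len(matrix))  # 3 bait-prey edges versus 6 pairwise edges
```

The matrix model always contains the spoke edges and adds the prey-prey pairs, which is why it yields denser (and noisier) graphs from the same purifications.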

Page 8: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Computing interaction scores

Gold standard reference PPI derived from MIPS/SGD complexes: true positives (TP) and true negatives (TN).

Krogan et al. (2006)
- combining data from different purifications and different MS techniques
- only bait-prey associations
- complex ‘training’ procedure
- ignored ribosomal proteins (baits)

Collins et al. (2006) [Consolidated network]
- combined data from Gavin and Krogan
- bait-prey + prey-prey associations
- new Protein Enrichment (PE) score: augmented version of Gavin’s scores + ‘training’ -> confidence scores

Gavin et al. (2006)
- combining data from different purifications
- bait-prey + prey-prey associations
- unbiased statistical procedure, log-odds based
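To make the idea of a log-odds based score concrete, here is a minimal sketch. It is not the scoring scheme of Gavin et al., Krogan et al. or the consolidated network; it simply compares how often two proteins are seen in the same purification with how often that would be expected if they occurred independently. The function name, the pseudocount and the toy purifications are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_scores(purifications, pseudocount=0.5):
    """Score each co-purified pair by log(observed / expected co-occurrence)."""
    n = len(purifications)
    seen = Counter()       # purifications containing each protein
    together = Counter()   # purifications containing each pair
    for proteins in purifications:
        members = sorted(set(proteins))
        seen.update(members)
        together.update(combinations(members, 2))
    scores = {}
    for (a, b), observed in together.items():
        # Expected number of shared purifications if a and b were independent.
        expected = seen[a] * seen[b] / n
        scores[(a, b)] = math.log((observed + pseudocount) / (expected + pseudocount))
    return scores


# Hypothetical purifications (each set lists the proteins identified together).
purifications = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "D"},
    {"C", "D"},
    {"D", "E"},
    {"E", "F"},
]
scores = log_odds_scores(purifications)
```

In this toy data A and B co-occur more often than expected and get a positive score, while A and D co-occur less often than expected and get a negative one.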

Page 9: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Comparing the PPI networks

[Figure: # true positive PP interactions plotted against # false positive PP interactions for the Consolidated, Gavin, Krogan and MIPS_small_scale networks. Annotations: Core data; S = 0.38. Inset: MIPS small scale.]

Network sizes shown:
- 2708 proteins, 7123 interactions
- 1622 proteins, 9074 interactions
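A curve of true positives against false positives can be traced by walking down a score-ranked list of interactions and checking each pair against a reference set. The sketch below assumes a dictionary of scored pairs and two reference sets (positives and negatives); these names and the toy values are hypothetical and only illustrate the bookkeeping.

```python
def tp_fp_curve(scored_pairs, positives, negatives):
    """Count true and false positives while lowering the score cutoff."""
    curve = []
    tp = fp = 0
    ranked = sorted(scored_pairs.items(), key=lambda item: -item[1])
    for pair, score in ranked:
        if pair in positives:
            tp += 1
        elif pair in negatives:
            fp += 1
        else:
            continue  # pair not covered by the reference set, so it is skipped
        curve.append((score, tp, fp))
    return curve


# Hypothetical scored interactions and reference sets.
scored_pairs = {
    ("A", "B"): 0.90,
    ("A", "C"): 0.54,
    ("B", "D"): 0.32,
    ("D", "E"): 0.20,
    ("C", "E"): 0.05,
}
positives = {("A", "B"), ("A", "C")}
negatives = {("B", "D"), ("C", "E")}
curve = tp_fp_curve(scored_pairs, positives, negatives)
```

Each entry of `curve` gives the cutoff score and the cumulative TP and FP counts at that cutoff, which are the two axes of the plot.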

Page 10: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Workflow diagram, steps (I) to (V): MALDI/MS and LC/MS; Deriving the PPI Network; Identifying Functional Modules; Validation and Analysis.]

Page 11: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Protein complexes are expected to ‘share’ components

[Figure: complexes C-1 and C-2 with unique components and shared components, linked by physical interactions; ‘recruitment’ of shared components depends on time and condition.]

This information is, however, currently not available from the purification data. The pulled-down complexes represent temporal and spatial averages of the in vivo distribution.

Page 12: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Common approach: parsing the PPI network into densely connected regions

Hierarchic clustering
- by near-neighbor contact score, or neighbor pattern

Markov Cluster Algorithm (MCL)
- Enright et al. (2002); Van Dongen S. (2002)
- simulates random walks within graphs by computing higher powers of the contact matrix, which measures similarity in paths of length 1, 2, 3, 4 between nodes in the graph
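The core loop of MCL can be sketched in plain Python as alternating expansion (matrix squaring, i.e. longer random walks) and inflation (element-wise powering followed by column normalization, which strengthens strong flows). This is a didactic sketch, not the reference `mcl` program; the parameter defaults, the cluster read-out and the toy graph of two triangles joined by one edge are assumptions.

```python
def normalize_columns(matrix):
    """Rescale every column so that it sums to 1."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def multiply(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50, tolerance=1e-6):
    """Return clusters as a set of frozensets of node indices."""
    n = len(adjacency)
    # Self-loops are added before normalizing, as is customary for MCL.
    flow = normalize_columns(
        [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(iterations):
        flow = multiply(flow, flow)  # expansion: two-step random walks
        flow = normalize_columns(    # inflation: boost strong flows
            [[value ** inflation for value in row] for row in flow])
    clusters = set()
    for row in flow:
        # Each non-empty row belongs to an attractor; its columns are the members.
        members = frozenset(j for j, value in enumerate(row) if value > tolerance)
        if members:
            clusters.add(members)
    return clusters


# Hypothetical graph: triangles 0-1-2 and 3-4-5 joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    adjacency[i][j] = adjacency[j][i] = 1.0
clusters = mcl(adjacency)
```

A higher inflation value tends to produce more, smaller clusters; a fixed iteration count is used here instead of a convergence test to keep the sketch short.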

Page 13: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Degree of overlap between complexes computed using different PPI networks and different methods

[Figure: bar chart of the mean number of overlaps per complex and of shared genes per overlapping complex, for four sets of complexes: MIPS; Consolidated MCL+overlap; Gavin (all); Gavin (core+module).]

Fraction of complexes sharing subunits with other complexes (for the four sets as listed above): 41.4%, 19.5%, 96.9%, 84.1%
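Statistics of this kind can be computed directly from a set of complexes given as protein sets. The sketch below is one plausible reading of the three quantities (fraction of complexes sharing subunits, mean overlaps per complex, mean shared proteins per overlapping pair); the exact definitions used for the slide are not given, so the function and the toy complexes are assumptions.

```python
def overlap_stats(complexes):
    """Summarize how much a collection of complexes shares components."""
    names = list(complexes)
    partners = {name: 0 for name in names}  # overlapping partners per complex
    shared_sizes = []                        # shared proteins per overlapping pair
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            shared = complexes[first] & complexes[second]
            if shared:
                partners[first] += 1
                partners[second] += 1
                shared_sizes.append(len(shared))
    overlapping = [name for name in names if partners[name] > 0]
    return {
        "fraction_overlapping": len(overlapping) / len(names),
        "mean_overlaps_per_complex": sum(partners.values()) / len(names),
        "mean_shared_per_overlap":
            sum(shared_sizes) / len(shared_sizes) if shared_sizes else 0.0,
    }


# Hypothetical complexes.
complexes = {
    "C-1": {"A", "B", "C"},
    "C-2": {"C", "D"},
    "C-3": {"E", "F"},
    "C-4": {"B", "C", "G"},
}
stats = overlap_stats(complexes)
```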

Page 14: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Workflow diagram, steps (I) to (V): MALDI/MS and LC/MS; Deriving the PPI Network; Identifying Functional Modules; Validation and Analysis.]

Page 15: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Figure: overlap with MIPS complexes for four sets of complexes, Gavin (491), Krogan (547), Consolidated_MCL (400) and Gavin_MCL (203), binned as overlap ≤ 5%; 5% < overlap ≤ 50%; 50% < overlap ≤ 90%; 90% < overlap.]

[Figure: precision and homogeneity scores for Gavin, Gavin_MCL, Krogan and this study.]

[Figure: co-localization (PPV score, from cellular localization annotations) and semantic similarity (SS per cluster, from GO annotations) for Gavin, Gavin_MCL, Krogan, this study and MIPS.]

Page 16: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Figure (a): precision and homogeneity scores for three pairs of complex sets: Gavin/Krogan, Gavin_MCL/Krogan and Gavin_PE/Krogan_PE.]

[Figure (b): overlap between the complex sets of each pair, Gavin (491) / Krogan (547), Gavin_MCL (203) / Krogan (547) and Gavin_PE (321) / Krogan_PE (640), binned as overlap ≤ 5%; 5% < overlap ≤ 50%; 50% < overlap ≤ 90%; 90% < overlap.]
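Binning complexes by how well they match another set can be sketched as follows. The slides do not define the overlap measure, so this example assumes |A ∩ B|² / (|A|·|B|) and scores each complex by its best match in the other set; the function names and toy sets are hypothetical.

```python
from collections import Counter

# Upper bounds of the first three bins; anything above 0.90 falls in the last bin.
BINS = [
    (0.05, "overlap <= 5%"),
    (0.50, "5% < overlap <= 50%"),
    (0.90, "50% < overlap <= 90%"),
]


def overlap_fraction(a, b):
    """Assumed overlap measure: |A & B|^2 / (|A| * |B|)."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def bin_label(fraction):
    for upper, label in BINS:
        if fraction <= upper:
            return label
    return "90% < overlap"


def best_overlap_bins(query, reference):
    """Bin each query complex by its best overlap with any reference complex."""
    counts = Counter()
    for complex_a in query:
        best = max((overlap_fraction(complex_a, complex_b)
                    for complex_b in reference), default=0.0)
        counts[bin_label(best)] += 1
    return counts


# Hypothetical complex sets.
query = [{"A", "B", "C"}, {"D", "E"}, {"F", "G", "H", "I"}, {"X", "Y"}]
reference = [{"A", "B", "C"}, {"D", "E", "Z"}, {"F", "Q", "R", "S"}]
bins = best_overlap_bins(query, reference)
```

Because the measure is symmetric but the best-match step is not, running the comparison in both directions (query against reference and reference against query) can give different bin counts.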

Page 17: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Ribosomal Small Subunit

Ribosomal Large Subunit

RNA Pol. I, II, III

19/22S Regulator

20S Proteasome

RSC

Mediator

Exosome

Mitochondrial Ribosome

TFIIIC

MRP RNase

APC

COP I

Golgi Transport

Exocyst

SRP

SNF1

H+ Transporting ATPase, Vacuolar

SAGA

GenePro: Vlasblom et al. (2006)

Page 18: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

POL II

POL III

POL I

Page 19: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

SAGA-like complex

TFIID

SAGA complex

ADA complex

Fig. 8c

Page 20: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Workflow diagram, steps (I) to (V): MALDI/MS and LC/MS; Deriving the PPI Network; Identifying Functional Modules; Validation and Analysis. Shown alongside the stages of protein 3D structure determination: Diffraction Pattern; Phase calculation; Model refinement; Protein 3D structure.]

Page 21: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

Acknowledgements

Shuye Pu (HSC, Toronto)
James Vlasblom (HSC, Toronto)
Chris Orsi (HSC, Toronto)
Mark Superina (HSC, Toronto)
Gina Liu (HSC, Toronto)
CCB Systems Support team (HSC, Toronto)

Nicolas Simonis (ULB, Belgium)
Jacques van Helden (ULB, Belgium)
Sylvain Brohée (ULB, Belgium)

Nevan Krogan (B&B, Toronto / HHMI, UCSF)
Jack Greenblatt (B&B, Toronto)
Sean Collins (HHMI, UCSF)
Jonathan Weissman (HHMI, UCSF)
Andrew Emili (B&B, Toronto)
John Parkinson (HSC, Toronto)
Haiyuan Yu (MBB, Yale U.)
Mark Gerstein (MBB, Yale U.)

Page 22: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Figure (a): number of proteins versus degree, log-log axes; R² = 0.90. Average node degree = 12.530.]

[Figure (b): number of complexes versus complex size, log-log axes; R² = 0.72. Average complex size = 5.245.]
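An R² value on log-log axes is typically obtained from a straight-line fit to the log-transformed counts. The sketch below shows such a fit with ordinary least squares; it is an illustration only, and the degree and count values are made-up data following an exact power law, not values read from the figure.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns slope, intercept, R^2."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    syy = sum((y - mean_y) ** 2 for y in log_y)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical distribution: count = 1024 * degree^-2.
degrees = [1, 2, 4, 8, 16]
counts = [1024, 256, 64, 16, 4]
slope, intercept, r_squared = loglog_fit(degrees, counts)
```

The fitted slope is the exponent of the power law; bins with zero counts have to be dropped before taking logarithms.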

Page 23: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Figure S6: distribution (fraction, y axis) of the uncentered Pearson correlation coefficient (x axis, -1 to 1) within complexes, between complexes and in random networks.]
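For reference, the uncentered Pearson correlation coefficient differs from the usual one in that the means are not subtracted: r = Σ xᵢyᵢ / sqrt(Σ xᵢ² · Σ yᵢ²). A minimal sketch, with hypothetical profile vectors:

```python
import math


def uncentered_pearson(x, y):
    """Uncentered Pearson correlation: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator if denominator else 0.0


# Hypothetical profiles.
same_direction = uncentered_pearson([1, 2, 3], [2, 4, 6])
opposite = uncentered_pearson([1, 2, 3], [-1, -2, -3])
orthogonal = uncentered_pearson([1, 0], [0, 1])
flat_partner = uncentered_pearson([1, 2, 3], [3, 3, 3])
```

Unlike the centered coefficient, which is undefined for a constant profile, the uncentered version still returns a value for `flat_partner`, because it measures the angle between the raw vectors.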

Page 24: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Fig. 5, panels (a) and (b): precision and homogeneity score as a function of the confidence score cutoff, for two series: varying proteins and varying interactions.]

Confidence score cutoff     0.38    0.23    0.15    0.10    0.05

Varying both the number of proteins and the number of interactions
#proteins                   1921    2270    2703    3625    4489
#interactions              12035   15060   19133   26149   38895

Varying only the number of interactions among the top 1921 proteins
#proteins                   1921    1921    1921    1921    1921
#interactions              12035   14647   17795   21920   27816
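The two ways of growing the network can be sketched as a filter on scored interactions: either keep every pair above the cutoff, or keep only pairs above the cutoff whose two proteins both belong to a fixed protein set. The function name, the toy scores and the fixed set are hypothetical.

```python
def network_at_cutoff(scored_pairs, cutoff, allowed_proteins=None):
    """Return (#proteins, #interactions) for pairs scoring at or above the cutoff."""
    edges = [
        pair for pair, score in scored_pairs.items()
        if score >= cutoff
        and (allowed_proteins is None
             or (pair[0] in allowed_proteins and pair[1] in allowed_proteins))
    ]
    proteins = {protein for pair in edges for protein in pair}
    return len(proteins), len(edges)


# Hypothetical scored interactions.
scored_pairs = {
    ("A", "B"): 0.90,
    ("A", "C"): 0.40,
    ("B", "C"): 0.20,
    ("C", "D"): 0.12,
    ("D", "E"): 0.06,
}
strict = network_at_cutoff(scored_pairs, 0.38)
relaxed = network_at_cutoff(scored_pairs, 0.10)
# Proteins of the strict network are kept fixed while the cutoff is relaxed.
relaxed_fixed = network_at_cutoff(scored_pairs, 0.10, allowed_proteins={"A", "B", "C"})
```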

Page 25: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical
Page 26: From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical

[Figure: tandem affinity purification (TAP) workflow]
- Bank of ORFs fused with a tag (ORF-TAP tag)
- Expression in yeast and lysis (tagged protein with its partners + other cellular proteins)
- Tandem affinity purification
- 1D SDS PAGE
- Identification of components by Mass Spec

Components in the example: B, C, E = YLR258w, YER133w, YER054c; A, Y, D = YPR184w, YKL085w, YPR160w