From High Throughput Pull-Downs to Protein Complexes:
Building a Model of the Physical Interactome of Yeast
Shoshana Wodak
Hospital for Sick Children
Depts. Biochemistry & Medical Genetics and Microbiology, University of Toronto
Swiss-Prot Fortaleza 2006
Network of physically interacting proteins
Protein complexes and the physical interactome
Complexes are the cell's factories
- spliceosome
- proteasome
- ribosome
- replication complex
- …
- cytochrome bc1
Essential role!
Our knowledge about them is limited…
- can be rather dynamic entities, with variable lifetimes
- their formation is likely regulated at various levels: transcriptional level, post-transcriptional modification, degradation…
1st step: mapping the physical interactome
Mapping binary interactions:
- two-hybrid screens
- split-ubiquitin screens (membrane proteins)
Characterizing complexes:
- overexpression & single affinity purification + MS
- tandem affinity purification + MS
Most extensive studies done in the yeast S. cerevisiae
Many low-throughput studies:
- MIPS/CYGD & SGD databases, now containing ~215-230 hand-curated protein complexes for S. cerevisiae
Several HTP studies:
- Two-hybrid: Ito et al. (2002), Uetz et al. (2002)
- AP & TAP: Ho et al. (2002), Gavin et al. (2002), Gavin et al. (2006), Krogan et al. (2006)
Similarities and differences between the two 2006 studies (Gavin/Krogan)
[Figure: # baits and # preys per study. Labels as extracted: Ho et al. (573), (1389); Gavin et al. (1993), (2671); Krogan et al. (2338), (5333). Collins et al. (2006).]
[Figure: Gavin complexes (491) vs. Krogan complexes (547), binned by overlap: overlap ≤ 5%; 5% < overlap ≤ 50%; 50% < overlap ≤ 90%; 90% < overlap. Segment counts as extracted: 25, 399, 47, 20.]
Gavin complexes (491) ≠ Krogan complexes (547) ???
High-throughput study of Krogan et al. (2006)
[Workflow diagram, steps (I)-(V). Labels: MALDI/MS, LC/MS; Deriving the PPI Network; Identifying Functional Modules; Validation and Analysis.]
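Interaction score calculation
Representing interactions
[Figure: MS analysis of each purification yields a PPI graph with edge weights w_i. Spoke model: the bait is linked to each of its preys. Matrix model: all proteins in a purification (bait and preys) are linked to one another. Example edge weights shown: 0.05, 0.90, 0.90, 0.32, 0.54.]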
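The spoke and matrix representations named on this slide can be sketched as follows. This is a minimal illustration only: the function names and the toy purification are hypothetical, and no weights w_i are computed.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: link the bait to each prey, and nothing else."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_pairs(bait, preys):
    """Matrix model: link every pair of proteins seen in the purification."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: bait Y pulled down with preys A, B and C.
bait, preys = "Y", ["A", "B", "C"]
print(len(spoke_pairs(bait, preys)))   # 3 bait-prey pairs
print(len(matrix_pairs(bait, preys)))  # 6 pairs among 4 proteins
```

The spoke pairs are always a subset of the matrix pairs, so the matrix model adds only prey-prey associations.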
Computing interaction scores
Gold standard reference PPI derived from MIPS/SGD complexes (TP, TN)

Krogan et al. (2006)
- combining data from different purifications and different MS techniques
- only bait-prey associations
- complex 'training' procedure
- ignored ribosomal proteins (baits)

Collins et al. (2006) [Consolidated network]
- combined data from Gavin and Krogan
- bait-prey + prey-prey associations
- new Purification Enrichment (PE) score: augmented version of Gavin's scores + 'training' -> confidence scores

Gavin et al. (2006)
- combining data from different purifications
- bait-prey + prey-prey associations
- unbiased statistical procedure, log-odds based
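To illustrate what a log-odds based score means in general, here is a deliberately simplified sketch. It is not the published scoring scheme of any of the studies above: it assumes a matrix-style representation and scores each pair by the log of observed over expected co-occurrences, where the expectation assumes the two proteins appear in purifications independently. The function name and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_scores(purifications):
    """Score protein pairs by log(observed / expected co-occurrence).

    purifications: list of sets, each holding the proteins (bait and preys)
    identified in one purification.
    """
    n = len(purifications)
    # Number of purifications in which each protein appears.
    freq = Counter(p for pur in purifications for p in pur)
    # Number of purifications in which each pair appears together.
    pair_counts = Counter(
        frozenset(pair)
        for pur in purifications
        for pair in combinations(sorted(pur), 2)
    )
    scores = {}
    for pair, observed in pair_counts.items():
        a, b = tuple(pair)
        # Co-occurrences expected if a and b were independent.
        expected = freq[a] * freq[b] / n
        scores[pair] = math.log(observed / expected)
    return scores


# Hypothetical toy data: four purifications.
toy = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"C", "D"}]
for pair, score in sorted(log_odds_scores(toy).items(), key=lambda kv: -kv[1]):
    print(sorted(pair), round(score, 3))
```

A positive score means the pair co-purifies more often than chance would predict under this simple null model; a negative score means less often.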
[Figure: # true positive PP interactions (y axis, 0-4000) vs. # false positive PP interactions (x axis, 0-200) for the Consolidated, Gavin, Krogan and MIPS small-scale networks. Core data: S = 0.38. Inset: MIPS small scale, axes 0-6000 and 0-2000.]
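A curve of true positives against false positives, like the one plotted here, can be computed by walking down the scored interactions from highest to lowest score. The sketch below assumes the gold standard is given as two sets of pairs (positives and negatives); how those sets are derived from the MIPS/SGD complexes is not specified on the slides, and all names and data are hypothetical.

```python
def tp_fp_curve(scored_pairs, gold_positives, gold_negatives):
    """Return cumulative (false positives, true positives) points as the
    score cutoff is lowered. Pairs absent from the gold standard are skipped."""
    tp = fp = 0
    curve = []
    ranked = sorted(scored_pairs.items(), key=lambda kv: kv[1], reverse=True)
    for pair, _score in ranked:
        if pair in gold_positives:
            tp += 1
        elif pair in gold_negatives:
            fp += 1
        else:
            continue  # no reference information for this pair
        curve.append((fp, tp))
    return curve


# Hypothetical scored network and gold standard.
scored = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "D"): 0.5,
    ("C", "D"): 0.4,
    ("E", "F"): 0.3,
}
positives = {("A", "B"), ("C", "D")}
negatives = {("B", "D")}
print(tp_fp_curve(scored, positives, negatives))  # [(0, 1), (1, 1), (1, 2)]
```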
Comparing the PPI networks
2708 proteins, 7123 interactions
1622 proteins, 9074 interactions
Protein complexes are expected to 'share' components
[Diagram: complexes C-1 and C-2 with unique components and shared components. Two scenarios: physical interaction (??), or 'recruitment' (time; condition).]
This information is, however, currently not available from the purification data. The pulled-down complexes represent temporal and spatial averages of the in vivo distribution.
Parsing the PPI network into densely connected regions
Common approach:
- Hierarchic clustering by near-neighbor contact score, or neighbor pattern
- Markov Cluster Algorithm (MCL), Enright et al. (2002); Van Dongen S. (2002): simulates random walks within graphs by computing higher moments (powers) of the contact matrix, which measures similarity in path lengths 1, 2, 3, 4 between nodes in the graph
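A minimal NumPy sketch of the Markov Cluster Algorithm named above, assuming an unweighted, symmetric adjacency matrix. The parameter names and default values (`expansion`, `inflation`, `max_iter`, `tol`) are illustrative choices, not settings from the talk, and the toy graph is hypothetical.

```python
import numpy as np


def mcl(adjacency, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
    """Cluster a graph by alternating expansion and inflation steps."""
    m = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(m, 1.0)              # add self-loops
    m /= m.sum(axis=0, keepdims=True)     # make columns sum to 1
    for _ in range(max_iter):
        prev = m
        m = np.linalg.matrix_power(m, expansion)  # expansion: longer random walks
        m = m ** inflation                        # inflation: favor strong flows
        m /= m.sum(axis=0, keepdims=True)
        if np.abs(m - prev).max() < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(np.nonzero(row > 1e-6)[0].tolist())
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two triangles, 0-1-2 and 3-4-5, with no link between them.
adjacency = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    adjacency[i, j] = adjacency[j, i] = 1.0
print(mcl(adjacency))  # [[0, 1, 2], [3, 4, 5]]
```

Raising `inflation` generally produces smaller, tighter clusters; lowering it produces fewer, larger ones.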
Degree of overlap between complexes computed using different PPI networks and different methods
[Figure: bar chart (y axis: mean, 0-45) of overlaps per complex and shared genes per overlapping complex for MIPS, Consolidated MCL+overlap, Gavin (all), and Gavin (core+module).]
Fraction of complexes sharing subunits with other complexes (values as listed on the slide): 41.4%, 19.5%, 96.9%, 84.1%
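The two quantities on this slide, the fraction of complexes sharing subunits with other complexes and the mean number of overlaps per complex, can be computed from a set of complexes as sketched below. The complexes and function name are hypothetical; "sharing" is taken here to mean having at least one subunit in common.

```python
def sharing_stats(complexes):
    """Return (fraction of complexes sharing a subunit with another complex,
    mean number of overlapping partner complexes per complex).

    complexes: dict mapping a complex name to its set of subunits.
    """
    names = list(complexes)
    partners = {name: 0 for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if complexes[a] & complexes[b]:  # at least one shared subunit
                partners[a] += 1
                partners[b] += 1
    sharing = sum(1 for name in names if partners[name] > 0)
    return sharing / len(names), sum(partners.values()) / len(names)


# Hypothetical complexes: C1 and C2 share "c", C2 and C4 share "d".
toy_complexes = {
    "C1": {"a", "b", "c"},
    "C2": {"c", "d"},
    "C3": {"e", "f"},
    "C4": {"d", "g"},
}
print(sharing_stats(toy_complexes))  # (0.75, 1.0)
```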
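[Figure: pie charts of complexes binned by overlap: overlap ≤ 5%; 5% < overlap ≤ 50%; 50% < overlap ≤ 90%; 90% < overlap. Segment counts as extracted:
- Gavin (491): 87, 341, 53, 10
- Krogan (547): 291, 177, 47, 32
- Gavin_MCL (203): 77, 71, 35, 20
- Consolidated_MCL (400): 209, 99, 50, 42]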
[Figure: precision and homogeneity scores (0-1) for Gavin, Gavin_MCL, Krogan, and this study. Overlap with MIPS complexes.]
[Figure: PPV score (0-1) for Gavin, Gavin_MCL, Krogan, this study, and MIPS; semantic similarity (SS) per cluster (0-12) and co-localization for Consolidated MCL. Cellular localization, GO annotations.]
[Figure: precision and homogeneity scores (0-1) for Gavin/Krogan, Gavin_MCL/Krogan, and Gavin_PE/Krogan_PE.]
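[Figure: panels (a), (b); pie charts of complexes binned by overlap: overlap ≤ 5%; 5% < overlap ≤ 50%; 50% < overlap ≤ 90%; 90% < overlap. Segment counts as extracted:
- Gavin (491) / Krogan (547): 25, 399, 47, 20
- Gavin_MCL (203) / Krogan (547): 13, 98, 58, 34
- Gavin_PE (321) / Krogan_PE (640): 30, 140, 40, 111]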
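The binning of complexes by overlap can be sketched as follows. The slides do not state how overlap is measured, so this example assumes one common choice, the squared number of shared proteins divided by the product of the two complex sizes, and assigns each complex in the first set to the bin of its best match in the second set. Function names and data are hypothetical.

```python
def overlap_score(a, b):
    """Assumed overlap measure: |a & b|^2 / (|a| * |b|), between 0 and 1."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def bin_best_overlaps(set1, set2):
    """Bin each complex in set1 by its best overlap with any complex in set2."""
    bins = {"<=5%": 0, "5-50%": 0, "50-90%": 0, ">90%": 0}
    for a in set1:
        best = max((overlap_score(a, b) for b in set2), default=0.0)
        if best <= 0.05:
            bins["<=5%"] += 1
        elif best <= 0.50:
            bins["5-50%"] += 1
        elif best <= 0.90:
            bins["50-90%"] += 1
        else:
            bins[">90%"] += 1
    return bins


# Hypothetical complex sets.
set1 = [{"a", "b", "c"}, {"d", "e"}, {"f", "g", "h", "i"}, {"x", "y"}]
set2 = [{"a", "b", "c"}, {"d", "z"}, {"f", "g", "h", "q"}]
print(bin_best_overlaps(set1, set2))
# {'<=5%': 1, '5-50%': 1, '50-90%': 1, '>90%': 1}
```

Because each complex in the first set lands in exactly one bin, the bin counts sum to the size of the first set, as the segment counts in the pie charts do.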
Ribosomal Small Subunit
Ribosomal Large Subunit
RNA Pol. I, II, III
19/22S Regulator
20S Proteasome
RSC
Mediator
Exosome
Mitochondrial Ribosome
TFIIIC
MRP RNase
APC
COP I
Golgi Transport
Exocyst
SRP
SNF1
H+ Transporting ATPase, Vacuolar
SAGA
[Figure: panels a-d, drawn with GenePro, Vlasblom et al. (2006). Labels: POL I, POL II, POL III; SAGA complex, SAGA-like complex, ADA complex, TFIID. Fig. 8c]
Protein 3D structure
Diffraction Pattern
Phase calculation
Model refinement
Acknowledgements
Shuye Pu (HSC, Toronto)
James Vlasblom (HSC, Toronto)
Chris Orsi (HSC, Toronto)
Mark Superina (HSC, Toronto)
Gina Liu (HSC, Toronto)
CCB Systems Support team (HSC, Toronto)
Nicolas Simonis (ULB, Belgium)
Jacques van Helden (ULB, Belgium)
Sylvain Brohée (ULB, Belgium)
Nevan Krogan (B&B, Toronto / HHMI, UCSF)
Jack Greenblatt (B&B, Toronto)
Sean Collins (HHMI, UCSF)
Jonathan Weissman (HHMI, UCSF)
Andrew Emili (B&B, Toronto)
John Parkinson (HSC, Toronto)
Haiyuan Yu (MBB, Yale U.)
Mark Gerstein (MBB, Yale U.)
[Figure: (a) number of proteins vs. degree, log-log axes 1-1000, R² = 0.90; (b) number of complexes vs. complex size, log-log axes 1-1000, R² = 0.72.]
Average node degree = 12.530
Average complex size = 5.245
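A straight-line fit on log-log axes, with its R² value, can be computed as in this sketch. It assumes a plain least-squares fit to the log10 of the counts, which may differ from how the plotted fits were obtained; the function name and toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def loglog_fit(values):
    """Fit log10(count) against log10(value) and return (slope, R^2).

    values: positive integers, e.g. node degrees or complex sizes.
    """
    counts = Counter(values)
    keys = sorted(counts)
    x = np.log10(np.array(keys, dtype=float))
    y = np.log10(np.array([counts[k] for k in keys], dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r_squared = 1.0 - residuals.var() / y.var()
    return float(slope), float(r_squared)


# Hypothetical degrees following an exact power law:
# 100 nodes of degree 1, 10 of degree 10, 1 of degree 100.
degrees = [1] * 100 + [10] * 10 + [100]
slope, r_squared = loglog_fit(degrees)
print(round(slope, 3), round(r_squared, 3))  # -1.0 1.0
```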
[Figure S6: distribution (fraction, 0-0.3) of the uncentered Pearson correlation coefficient (-1 to 1) for protein pairs within complexes, between complexes, and in random networks.]
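The uncentered Pearson correlation coefficient differs from the usual one in that the means are not subtracted, so it equals the cosine of the angle between the two profiles. A minimal sketch, assuming each protein is described by a numeric profile (the slides do not specify which measurements are used):

```python
import numpy as np


def uncentered_pearson(x, y):
    """Uncentered Pearson correlation: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))


# Hypothetical profiles.
print(uncentered_pearson([1, 2, 3], [2, 4, 6]))  # proportional profiles, close to 1
print(uncentered_pearson([1, 0], [0, 1]))        # orthogonal profiles, 0.0
```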
[Fig. 5: precision and homogeneity scores (y axis, 0.2-0.9) vs. confidence score cutoff (x axis, 0-0.4); series: varying proteins, varying interactions; panels (a), (b); labels: Precision, Homogeneity.]

Confidence score cutoff      0.38   0.23   0.15   0.10   0.05
Varying both the number of proteins and the number of interactions
#proteins                    1921   2270   2703   3625   4489
#interactions               12035  15060  19133  26149  38895
Varying only the number of interactions among the top 1921 proteins
#proteins                    1921   1921   1921   1921   1921
#interactions               12035  14647  17795  21920  27816
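The two ways of varying the network described for Fig. 5 can be sketched as a filter on confidence scores. The function name, the `allowed` parameter, and the toy scores are hypothetical; the sketch only shows how the protein and interaction counts change with the cutoff.

```python
def network_at_cutoff(scored_pairs, cutoff, allowed=None):
    """Return (#proteins, #interactions) for pairs scoring at least `cutoff`.

    If `allowed` is given, keep only interactions whose two proteins are both
    in that set, so that only the number of interactions can vary.
    """
    kept = {
        pair
        for pair, score in scored_pairs.items()
        if score >= cutoff
        and (allowed is None or (pair[0] in allowed and pair[1] in allowed))
    }
    proteins = {protein for pair in kept for protein in pair}
    return len(proteins), len(kept)


# Hypothetical confidence scores.
scores = {("A", "B"): 0.5, ("B", "C"): 0.2, ("A", "C"): 0.12, ("C", "D"): 0.1}

# Varying both proteins and interactions.
print(network_at_cutoff(scores, 0.2))   # (3, 2)
print(network_at_cutoff(scores, 0.1))   # (4, 4)

# Varying only interactions among the proteins present at the stricter cutoff.
top = {"A", "B", "C"}
print(network_at_cutoff(scores, 0.1, allowed=top))  # (3, 3)
```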
[Diagram: tandem affinity purification (TAP) workflow. Tagged bait Y (ORF + TAP tag) with partners A, B, C, D, E, + other cellular proteins.]
- Bank of ORFs fused with a tag
- Expression in yeast and lysis
- Tandem affinity purification
- 1D SDS-PAGE
- Identification of components by mass spec
Legend (as extracted): B, C, E = YLR258w, YER133w, YER054c; A, Y, D = YPR184w, YKL085w, YPR160w