Beyond Static Networks: A Bayesian Non-Parametric Approach
Michael P.H. Stumpf
Theoretical Systems Biology Group, Department of Life Sciences, Imperial College London
www.theosysbio.bio.ic.ac.uk
22nd July 2013
Networks: Mapping Processes and Understanding
Beyond Static Networks M Stumpf 1 of 13
Graphical models of gene regulatory networks
• Provide a convenient representation
• Numerous approaches for learning
• Large p, small n
Gaussian Graphical Models
Bayesian Networks
Dynamic Bayesian Networks
Biological networks Tom Thorne 3 / 35
Biology is Dynamic: Networks Change with Time
[Figure: schematic with nodes A, AP and B]
• Inferred regulatory network structures represent correlations rather than direct interactions.
• Gene products may require activation and need to be transported into the nucleus to influence regulation; or complexes formed by signalling cascades may be required to activate transcription.
• Many factors that are not a part of a traditional regulatory network model can also influence regulatory interactions.
• These relationships may change depending on external signals or other factors.
Capturing Biological Dynamics: Changepoint Models for Networks
• We can include hidden factors that may change the regulatory interactions taking place in our model by allowing the regulatory network structure to vary between time points and/or conditions.
• In changepoint models the time series is divided into a number of segments, allowing a different network structure in each.
• Using Bayesian inference it is possible to infer the posterior distribution of changepoint positions.
[Figure: time axis with time points 1 to 10]
S. Lebre, J. Becq, F. Devaux, M. P. H. Stumpf, G. Lelandais, Statistical inference of the time-varying structure of gene-regulation networks. BMC Systems Biology, 4:130, 2010.
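The segmentation idea behind changepoint models can be illustrated with a minimal Python sketch. It only shows how a set of changepoints partitions the time points; it is not the inference procedure of the cited paper. The function name `segment_by_changepoints` and the changepoint positions 4 and 8 are assumptions made for illustration.

```python
def segment_by_changepoints(n_timepoints, changepoints):
    """Split time points 1..n into contiguous segments.

    A changepoint c means a new segment (and hence a possibly different
    network structure) starts at time point c.
    """
    bounds = [1] + sorted(changepoints) + [n_timepoints + 1]
    return [list(range(start, stop)) for start, stop in zip(bounds[:-1], bounds[1:])]


# Hypothetical changepoints at time points 4 and 8 (illustration only).
segments = segment_by_changepoints(10, [4, 8])
for index, segment in enumerate(segments, start=1):
    print(f"segment {index}: time points {segment}")
```

In a Bayesian treatment the changepoint positions would themselves be random, and one such partition would be scored for every posterior sample.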
Modelling Gene Expression Networks
Given gene expression time series data over $m$ genes at $n$ time points, we denote the observations as the $n \times m$ matrix
$$X = (x_1, \ldots, x_n)^T,$$
where $x_j = (x_{j1}, \ldots, x_{jm})^T$ is the column vector of expression levels for each of the $m$ genes at time point $j$.
We formulate our model as a Hierarchical Dirichlet Process Hidden Markov Model, a stochastic process whereby a set of hidden states $s_1, \ldots, s_n$ govern the parameters of some emission distribution $F$ over a sequence of time points $1, \ldots, n$.
Each observation $x_j$ is then generated from a corresponding emission distribution $F(\theta_k)$, where $s_j = k$. For our emission distributions, $F$, we use a Bayesian Network model over the $m$ variables to represent the regulatory network structures corresponding to each hidden state.
Bayesian Networks
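Conditional probability distribution represented by a Directed Acyclic Graph (DAG)
[Figure: DAG with edges X1 → X3, X2 → X3, X3 → X4, X3 → X5, X4 → X6, X5 → X6]
$$P(X_1, \ldots, X_6) = P(X_1)\,P(X_2)\,P(X_3 \mid X_1, X_2)\,P(X_4 \mid X_3)\,P(X_5 \mid X_3)\,P(X_6 \mid X_4, X_5)$$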
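As a sketch of what this factorisation means in practice, the Python below encodes the parent sets of the six-node DAG and multiplies the local conditionals. The binary variables and the conditional distribution `prob_one` are made up purely for illustration and are not taken from the talk.

```python
from itertools import product

# Parent sets of the six-node DAG from the slide.
parents = {
    "X1": (), "X2": (), "X3": ("X1", "X2"),
    "X4": ("X3",), "X5": ("X3",), "X6": ("X4", "X5"),
}


def prob_one(node, parent_values):
    """Hypothetical CPD: P(node = 1 | parents), made up for illustration."""
    return (1 + sum(parent_values)) / (2 + len(parent_values))


def joint_probability(assignment):
    """Multiply the local conditionals P(X_i | pa(X_i)) over all nodes."""
    p = 1.0
    for node, pa in parents.items():
        p1 = prob_one(node, [assignment[q] for q in pa])
        p *= p1 if assignment[node] == 1 else 1.0 - p1
    return p


nodes = list(parents)
total = sum(joint_probability(dict(zip(nodes, values)))
            for values in product([0, 1], repeat=len(nodes)))
print(f"sum over all 64 binary assignments: {total:.6f}")
```

Because every factor is a proper conditional distribution, the product sums to one over all joint assignments, which is the property the DAG factorisation guarantees.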
Order sampling:
$$p(\prec) \sim \prod_{u \in N_G} \sum_{\mathrm{pa}^{\prec}_G(u)} p(u, \mathrm{pa}_G)$$
What We Want to Know is Often Not Measured: Hidden Markov Models
• Here we measure transcriptomic data, whereas the action is all due to proteins and their interactions among themselves and with DNA/RNA.
• We measure mRNA expression ($y_i$), which is influenced by a network ($s_i$) that is not or cannot be observed directly.
• The transition probability between different states (networks) is given by $\pi_{kl} = p(s_j = l \mid s_{j-1} = k)$.
[Figure: hidden Markov chain $s_1, s_2, s_3, \ldots, s_T$ with transition distribution $\pi_{s_1}$, observations $y_1, \ldots, y_T$ and emission parameters $\theta_{s_1}, \ldots, \theta_{s_T}$]
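The generative structure of a hidden Markov model can be sketched as follows. This is a toy with a fixed, finite transition matrix and Gaussian emissions, both assumptions made for brevity; in the talk the emissions are Bayesian networks and the number of states is not fixed. The function name `simulate_hmm` and all numerical values are hypothetical.

```python
import random


def simulate_hmm(transition, means, n_steps, seed=0):
    """Sample hidden states s_1..s_T and observations y_1..y_T.

    transition[k][l] plays the role of pi_kl = p(s_j = l | s_{j-1} = k).
    Emissions are Gaussian with a state-specific mean (an assumption made
    for brevity; the talk uses Bayesian networks as emissions).
    """
    rng = random.Random(seed)
    n_states = len(transition)
    state = rng.randrange(n_states)  # uniform initial state (assumption)
    states, observations = [], []
    for _ in range(n_steps):
        states.append(state)
        observations.append(rng.gauss(means[state], 0.1))
        state = rng.choices(range(n_states), weights=transition[state])[0]
    return states, observations


# Hypothetical two-state chain that tends to stay in its current state.
pi = [[0.9, 0.1],
      [0.2, 0.8]]
states, ys = simulate_hmm(pi, means=[0.0, 5.0], n_steps=10)
print(states)
print([round(y, 2) for y in ys])
```

Only `ys` would be available in an experiment; inference has to recover `states` (and the parameters) from the observations alone.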
The Chinese Restaurant Process
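[Figure: customers seated at tables labelled $\theta_1, \theta_2, \theta_3, \theta_4, \ldots$, with a new table $\theta_5$ drawn from the base measure $H$]
Analogy for the Dirichlet process due to Pitman and Dubins
D. Aldous, Exchangeability and Related Topics. In l'École d'été de probabilités de Saint-Flour, XIII, pages 1-198. 1983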
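The seating rule of the Chinese restaurant process is short enough to simulate directly. The sketch below assumes the standard formulation (join a table with probability proportional to its occupancy, open a new one with probability proportional to a concentration parameter, here called `alpha`); the function name and the parameter values are illustrative.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Seat customers one by one; return the table index of each customer.

    Each customer joins an existing table with probability proportional to
    its current occupancy, or opens a new table with probability
    proportional to the concentration parameter alpha.
    """
    rng = random.Random(seed)
    table_counts = []   # table_counts[t] = number of customers at table t
    seating = []
    for _ in range(n_customers):
        weights = table_counts + [alpha]          # last entry = new table
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_counts):
            table_counts.append(1)                # new table; its theta would be drawn from H
        else:
            table_counts[table] += 1
        seating.append(table)
    return seating, table_counts


seating, counts = chinese_restaurant_process(n_customers=20, alpha=1.0)
print("seating:", seating)
print("table sizes:", counts)
```

Larger values of `alpha` tend to produce more tables, which is how the number of clusters is allowed to grow with the data.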
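Systems at Different Times are Related: The Chinese Restaurant Franchise
[Figure: restaurant tables serving dishes $\theta_2, \theta_1, \theta_1, \theta_3, \theta_2, \theta_2$ (concentration $\alpha$), chosen from a shared menu $\theta_1, \theta_2, \theta_3$ with new dishes $\theta' \sim H$ (concentration $\gamma$)]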
Each hidden state $k$ possesses a Dirichlet Process $G_k$ from which $\pi_{k\cdot}$ is drawn, and a common (discrete) base measure $G_0$ is shared between the $G_k$, so that
$$G_k \sim \mathrm{DP}(\alpha, G_0).$$
Thus transitions are made into a discrete set of states shared across all of the $G_k$, and drawn from $G_0$. The base measure $G_0$ is in turn drawn from a Dirichlet Process,
$$G_0 \sim \mathrm{DP}(\gamma, H),$$
where $H$ is the prior for the emission distributions $F_k$.
Using the stick-breaking construction for $G_0$ and drawing $\theta_l \sim H$, we have
$$G_0 = \sum_{l=1}^{\infty} \beta_l \delta_{\theta_l} \quad \text{with } \beta \sim \mathrm{GEM}(\gamma),$$
and so
$$G_k = \sum_{l=1}^{\infty} \pi_{kl} \delta_{\theta_l} \quad \text{with } \pi_k \sim \mathrm{DP}(\alpha, \beta).$$
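The stick-breaking construction for the weights $\beta \sim \mathrm{GEM}(\gamma)$ can be sketched in a few lines. The infinite sum is truncated at a finite number of sticks, which is an approximation assumed here for illustration; the function name `stick_breaking` and the values of `gamma` and `n_sticks` are hypothetical.

```python
import random


def stick_breaking(gamma, n_sticks, seed=0):
    """Truncated GEM(gamma) weights: beta_l = v_l * prod_{m<l} (1 - v_m)."""
    rng = random.Random(seed)
    remaining = 1.0      # length of stick not yet broken off
    weights = []
    for _ in range(n_sticks):
        v = rng.betavariate(1.0, gamma)   # v_l ~ Beta(1, gamma)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


beta, leftover = stick_breaking(gamma=2.0, n_sticks=25)
print("first five weights:", [round(b, 3) for b in beta[:5]])
print("mass beyond truncation:", round(leftover, 6))
```

The weights and the leftover mass always sum to one, so the truncation error is simply the mass not yet broken off the stick.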
Biological Systems do Not Change Wildly (Assumption!): Hidden States are Correlated
[Figure: hidden states $s_1, \ldots, s_9$ and observations $y_1, \ldots, y_9$ at time points 1 to 9]
Systems at Different Times are Related: HDP-HMM
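[Figure: HDP-HMM graphical model with nodes $H$, $\theta_k$, $\gamma$, $\beta$, $\alpha$, $\pi_{k\cdot}$, hidden states $s_0, s_1, s_2, \ldots, s_n$ and observations $y_1, y_2, \ldots, y_n$; the plates over $\theta_k$ and $\pi_{k\cdot}$ are infinite]
• Base measure $H$
• Shared state distribution $\beta$
• Transition distributions $\pi_{i,\cdot}$
• State sequence $s_0, \ldots, s_n$
• Observations $y_1, \ldots, y_n$
Each $s_i$ is a Bayesian network and $H$ specifies the prior over the parameters for the emission distribution.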
To sample from the hidden state sequence we have used a Gibbs sampling procedure based on the conditional probabilities for the hidden state $s_j$ given the remaining hidden states $s_{-j}$, updating each hidden state individually in a sweep over the $n$ states,
$$p(s_j = k \mid s_{-j}, \alpha, \beta, \kappa) \propto \left(N^{-j}_{s_{j-1}k} + \alpha\beta_k + \kappa\,\delta_{s_{j-1}}(k)\right) \frac{N^{-j}_{k s_{j+1}} + \alpha\beta_{s_{j+1}} + \kappa\,\delta_{s_{j+1}}(k) + \delta_{s_{j-1}}(k)\,\delta_{s_{j+1}}(k)}{\alpha + N^{-j}_{k\cdot} + \kappa + \delta_{s_{j-1}}(k)} \; p(X_{j\cdot} \mid X_{i\cdot} : s_i = k, i \neq j, F_k), \tag{1}$$
where $N^{-j}_{kl}$ indicates the number of observed transitions from state $k$ to state $l$ within the hidden state sequence $s_{-j}$, $N^{-j}_{k\cdot}$ the total number of transitions from state $k$ within $s_{-j}$, and $\kappa$ adds weight to self-transitions.
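The transition factor of equation (1) can be sketched in Python as below. Several things here are assumptions made for illustration: the likelihood term $p(X_{j\cdot} \mid \ldots)$ is omitted, $\beta$ is truncated to a finite list, only interior positions are handled, and the function name, state sequence and hyperparameter values are hypothetical. It is not the authors' implementation.

```python
def transition_weights(states, j, n_states, alpha, beta, kappa):
    """Unnormalised prior part of p(s_j = k | s_{-j}) for each k.

    Mirrors the transition factor of equation (1); the likelihood term
    p(X_j | ...) is omitted. Requires an interior position 0 < j < len - 1.
    `beta` is a finite list of length n_states (a truncation assumed here).
    """
    prev_s, next_s = states[j - 1], states[j + 1]
    # Count transitions k -> l in the sequence with s_j removed, i.e.
    # excluding the two transitions (s_{j-1}, s_j) and (s_j, s_{j+1}).
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(1, len(states)):
        if t not in (j, j + 1):
            counts[states[t - 1]][states[t]] += 1
    weights = []
    for k in range(n_states):
        d_prev = 1 if k == prev_s else 0
        d_next = 1 if k == next_s else 0
        into_k = counts[prev_s][k] + alpha * beta[k] + kappa * d_prev
        out_of_k = (counts[k][next_s] + alpha * beta[next_s]
                    + kappa * d_next + d_prev * d_next)
        total_out = alpha + sum(counts[k]) + kappa + d_prev
        weights.append(into_k * out_of_k / total_out)
    return weights


# Hypothetical sequence and hyperparameters, for illustration only.
s = [0, 0, 1, 1, 0, 0]
w = transition_weights(s, j=2, n_states=2, alpha=1.0, beta=[0.5, 0.5], kappa=2.0)
print([round(x / sum(w), 3) for x in w])
```

In a full sampler each weight would be multiplied by the likelihood of the observation at time point j under state k before normalising and drawing the new state.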
D. melanogaster development
Expression data for D. melanogaster midgut development. Taken at 11 time points, during which larval midgut becomes adult midgut.
A. thaliana diurnal cycle
Expression data for A. thaliana over 24 hours, with a light and dark phase.
The S. cerevisiae Cell Cycle
Expression data for S. cerevisiae over two cell cycles, at 25 time points.
[Figure: panels 1 to 4; frequency (0.0 to 1.0) plotted against an axis with ticks 0, 10, ..., 90, 105, 120]
T. Pramila, W. Wu, S. Miles, W.S. Noble et al., The Forkhead transcription factor Hcm1 regulates chromosome segregation genes and fills the S-phase gap in the transcriptional circuitry of the cell cycle. Genes Dev 20(16):2266-78, 2006.
Candida glabrata osmotic stress response
[Figure: two inferred networks (1 and 2) over genes including SPT16, FPS1, EMC6, SMX3, ISD11, MKS1, CAGL0K04235g, VMA22, SRB8, CAGL0H00704g, BUD31, CAGL0K06127g, YJR085C and CUE2; frequency (0.0 to 1.0) against time point (mins): 15, 30, 60, 90, 120, 150, 180, 240]
Two distinct regulatory architectures appear to control the expression of the genes involved in osmotic stress response in C. glabrata.
Temporal Dependencies
T < 30 min: ISD11 → SMX3
T > 30 min: ISD11 → BUD31, SMX3 → BUD31
Interactions change with time and may be contingent on past interactions.
Acknowledgements
Imperial College London
• Thomas Thorne
• Justina Zurauskine
• Paul Kirk
• Daniel Silk
Thorne & Stumpf, Bioinformatics, 2012, 28:3298
Thorne et al., Mol Biosystems, 2013, 9:1736-1742
Exeter University
• Andrew McDonagh
• Melanie Puttnam
• Lauren Ames
• Ken Haynes