Visualization of Ciona Intestinalis

Co-expression Network

by

Hang Zhong

A dissertation submitted in partial fulfillment

of the requirements for the degree of

Master of Science

Department of Biology

New York University

May, 2012

ACKNOWLEDGEMENTS

I would like to thank my advisor, Richard Bonneau, for giving me the opportunity to participate in this project and for his ongoing guidance and support. I am also indebted to Professor Lionel Christiaen for inspiring the project. This thesis could not have come to fruition without the help of Florian Razy, who offered insightful and thought-provoking input.

I am also everlastingly grateful to Duncan Penfold-Brown for teaching me programming. I would also like to thank Kieran Mace, Aviv Madar, Kevin Drew, Maximilian Haeussler and Claudia Racioppi, who so patiently offered their time and support. Many thanks to Todd Heiniger and Joel Rodriguez for revising the thesis.

Finally, I would like to thank my family for the invaluable support they have given me in the course of my life and studies.

ABSTRACT

Abnormalities of heart development cause the most frequent congenital diseases in humans. The conservation of the Gene Regulatory Network (GRN) involved in heart development, together with cellular simplicity, low genetic redundancy and a relevant evolutionary position, leads researchers to study the ascidian Ciona intestinalis. To extract useful information from microarray data so that researchers can infer the heart network in Ciona, this thesis not only applies standard approaches to find differentially expressed genes, but also explores network-based approaches to find functional groups. By visualizing the co-expression network in Gaggle, the list of atrial siphon muscle (ASM) and heart candidate genes is fine-tuned. In addition, the modules containing candidate and known marker genes may deserve further study.

TABLE OF CONTENTS

ABSTRACT
1. INTRODUCTION
1.1 GENE REGULATORY NETWORK OF CARDIOGENIC PRECURSORS IN CIONA
1.2 MICROARRAY DATA ANALYSIS
1.3 NETWORK VISUALIZATION THROUGH GAGGLE
2. METHODS
2.1 MICROARRAY EXPERIMENTAL DESIGN
2.2 GENE EXPRESSION DATA
2.2.1 QUALITY CONTROL
2.2.2 PREPROCESSING
2.3 STATISTICAL TEST
2.4 CLUSTER ANALYSIS
2.5 FUNCTIONAL ENRICHMENT ANALYSIS
2.6 GENERATION OF NETWORKS
2.6.1 STRING PROTEIN NETWORK
2.6.2 UNWEIGHTED CO-EXPRESSION NETWORK
2.6.3 WEIGHTED CO-EXPRESSION NETWORK
2.7 NETWORK VISUALIZATION
2.7.1 FILE FORMAT
2.7.2 ANALYZING NETWORK BY PLUGIN IN CYTOSCAPE
3. RESULTS
3.1 DIFFERENTIAL EXPRESSION
3.1.1 EXPECTATION OF THE MICROARRAY DATA
3.1.2 ASM AND HEART CANDIDATE GENES
3.2 NETWORK VISUALIZATION IN GAGGLE
3.2.1 NETWORKS
3.2.2 FINDINGS FROM THE NETWORK VISUALIZATION IN GAGGLE
3.2.2.1 GAGGLE AS INFORMATION INTEGRATION CENTER
3.2.2.2 MODULE FROM ALLEGROMCODE
3.2.2.3 MODULE FROM WEIGHTED NETWORK
3.2.2.4 FINE-TUNED LIST
4. DISCUSSION
4.1 ASM CANDIDATE GENES
4.2 ANNOTATION IN CIONA INTESTINALIS
4.3 FUNCTIONAL RIBOSOME GROUP AND COE
4.4 TIME-SERIES
4.5 LIMITATIONS OF THE CO-EXPRESSION NETWORK

FIGURES AND TABLES

FIGURE 1 PIPELINE.
FIGURE 2 NORMALIZED UNSCALED STANDARD ERROR (NUSE).
FIGURE 3 HEAT-MAP OF ASM AND HEART CANDIDATE GENES.
FIGURE 4 OUTPUT OF THE SHORT TIME-SERIES EXPRESSION MINER.
FIGURE 5 SELECTING SOFT POWER.
FIGURE 6 CIONA INTESTINALIS WEIGHTED CO-EXPRESSION NETWORK.
FIGURE 7 MODULE SIGNIFICANCE.
FIGURE 8 INTRAMODULAR CONNECTIVITY AND MODULE SIGNIFICANCE.
FIGURE 9 STRING PROTEIN NETWORK.
FIGURE 10 LABELING IN WEIGHTED NETWORK.
FIGURE 11 THE 1ST MODULE INFERRED BY ALLEGROMCODE FOR UNWEIGHTED CO-EXPRESSION NETWORK.
FIGURE 12 THE 1ST MODULE OF UNWEIGHTED CO-EXPRESSION NETWORK ENRICHMENT.
FIGURE 13 THE 1ST MODULE INFERRED BY ALLEGROMCODE FOR WEIGHTED CO-EXPRESSION NETWORK.
FIGURE 14 THE 1ST MODULE OF WEIGHTED NETWORK ENRICHMENT.
FIGURE 15 RIBOSOME GROUP IN THE STRING.
FIGURE 16 RIBOSOME GROUP IN STRING NETWORK ENRICHMENT.
FIGURE 17 RIBOSOME GROUP AND COE.
FIGURE 18 GREY COLOR GENES.
FIGURE 19 TAN MODULE
FIGURE 20 BROWN MODULE
FIGURE 21 TURQUOISE MODULE ENRICHMENT.
FIGURE 22 GENES IN TURQUOISE PLUS STEM CONDITION.
FIGURE 23 GENES OF TURQUOISE PLUS STEM CONDITION ENRICHMENT.
FIGURE 24 SUB-GROUP OF CANDIDATE GENES IN UNWEIGHTED NETWORK.
FIGURE 25 SUB-GROUP OF CANDIDATE GENES IN UNWEIGHTED NETWORK ENRICHMENT.
FIGURE 26 ASM CANDIDATE GENES IN WEIGHTED NETWORK ENRICHMENT.
FIGURE 27 ASM AND HEART CANDIDATE GENES

REFERENCES

1. INTRODUCTION

1.1 Gene regulatory network of cardiogenic precursors in Ciona

Abnormalities of heart development cause the most frequent congenital diseases in humans. The conservation of the Gene Regulatory Network (GRN) involved in heart development, together with cellular simplicity, low genetic redundancy and a relevant evolutionary position, leads researchers to study the ascidian Ciona intestinalis (Davidson 2007). In Ciona, a single pair of blastomeres called B7.5 gives rise to the anterior tail muscle (ATM) and to the trunk ventral cells (TVC) (Figure 27). Following migration from the tail, the TVCs undergo asymmetric cell divisions at the ventral midline of the trunk. The medial TVCs give rise to the heart, while the lateral TVCs migrate toward the atrial placode, where they form the atrial siphon muscles (ASM). Thus, the TVCs are similar to the multipotent cardio-pharyngeal progenitors found in vertebrates, while the ASM are likely equivalent to the jaw muscle in vertebrates.

A few years ago, the first cardiogenic GRN in Ciona was proposed (Christiaen, Davidson et al. 2008), decoupling genes necessary for heart specification from genes necessary for cell migration. A later study showed that ASM precursors express the transcription factor COE (Stolfi, Gainous et al. 2010), which is necessary and sufficient to specify ASM fate. Misexpression of COE in the whole TVC lineage blocks heart development and imposes an ASM fate on all cells. Conversely, misexpression of a constitutive repressor form of COE provokes the opposite phenotype, blocking ASM formation and causing all cells to form heart tissue. Genome-wide microarray analysis of this crucial COE gene and of its downstream effectors is therefore expected to give insight into the gene regulatory network of the heart.

1.2 Microarray data analysis

Most existing studies have focused on differential expression to identify genes that distinguish different sets of samples. It is common to apply a statistical test, such as the t-test, the F-test, or a nonparametric alternative such as the Wilcoxon test, to rank thousands of genes and then select the most significant ones (Gentleman 2005). Other statistical methods developed specifically for microarray data are also commonly used, such as Significance Analysis of Microarrays (SAM) (Tusher, Tibshirani et al. 2001) and LIMMA (Wettenhall, Smyth 2004), which fits linear models with empirical Bayes moderation.

Another way of using microarray data is to understand the network properties of an individual gene or protein by studying co-expression, where genes that have similar expression patterns across a set of samples are hypothesized to have a functional relationship. This co-expression network-based approach is consistent with an important concept that has emerged over the past decade: genes and their protein products carry out cellular processes in the context of functional modules and are interrelated (Barabasi, Bonabeau 2003, Barabasi, Oltvai 2004).

1.3 Network visualization through Gaggle

It is well recognized that visualization plays a key role in helping to understand biological systems, particularly in the era of high-throughput studies with a wealth of ‘omics’-scale data (Gehlenborg, O'Donoghue et al. 2010). This thesis applies the simple, open-source Java software system Gaggle (Shannon, Reiss et al. 2006) for co-expression network visualization. Gaggle is a cross-platform system integrated with diverse databases (KEGG, BioCyc, and STRING) and software (Cytoscape, DataMatrixViewer, the R statistical environment, and the TIGR MultiExperiment Viewer). With four simple data types (names, matrices, networks, and associative arrays), researchers can explore many different data sources and a variety of software tools by sending information to the Gaggle Boss, which transfers it to the other tools.

2. METHODS

The analysis pipeline of this thesis is shown in Figure 1.

2.1 Microarray experimental design

The microarray data used in this study were kindly provided by Dr. Lionel Christiaen. They consist of 30,969 probe sets from Affymetrix GeneChips. The perturbation group includes the LacZ control and the over-expression and loss of function of the transcription factor Collier/EBF/Olf (COE) in sorted TVC cells at 21 hours post fertilization (hpf), that is, after the asymmetric divisions of the TVCs but before completion of the ASM migration. The time-series group comprises 11 time points in TVC cells, taken every 2 hours from 8 to 28 hours.

2.2 Gene expression data

2.2.1 Quality control

This thesis applies arrayQualityMetrics (Kauffmann, Gentleman et al. 2009), a Bioconductor package for quality control. It provides an HTML report with several diagnostic plots. In general, an array is discarded if the report identifies it as an outlier both before and after normalization.

The microarray data are first imported into the statistical programming language R and then passed through quality control with arrayQualityMetrics. The sample LacZ.3 is removed because it was reported as an outlier both before and after normalization (Figure 2).

2.2.2 Preprocessing

The CEL files of the microarrays are normalized by the RMA method (Gentleman 2005). The expression matrix contains 30,969 probes and 48 arrays. After non-specific filtering by variance (IQR = 0.5), the matrix contains 15,484 probes and 48 arrays.

Using the collapseRows function in WGCNA, the probe with maximum variance is selected to represent each gene. After merging the probes, the merged matrix contains 10,079 probes and 48 arrays.
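The two preprocessing steps can be illustrated with a minimal Python sketch. It assumes that the variance filter keeps probes whose interquartile range (IQR) lies above a quantile of all probe IQRs; a quantile of 0.5 would retain about half of the probes, which is consistent with the drop from 30,969 to 15,484. The function names, probe identifiers and toy values are hypothetical, and the thesis used the collapseRows function in WGCNA (in R) for the second step.

```python
from statistics import pvariance, quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of one probe across arrays."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_by_iqr(expr, quantile=0.5):
    """Keep probes whose IQR is strictly above the given quantile of all IQRs."""
    spreads = {probe: iqr(vals) for probe, vals in expr.items()}
    ranked = sorted(spreads.values())
    cutoff = ranked[int(quantile * (len(ranked) - 1))]
    return {probe: vals for probe, vals in expr.items() if spreads[probe] > cutoff}


def collapse_by_max_variance(expr, probe_to_gene):
    """Represent each gene by its probe with the largest variance."""
    best = {}  # gene -> (variance, probe)
    for probe, vals in expr.items():
        gene = probe_to_gene[probe]
        var = pvariance(vals)
        if gene not in best or var > best[gene][0]:
            best[gene] = (var, probe)
    return {gene: expr[probe] for gene, (_, probe) in best.items()}


# Hypothetical toy data: four probes measured on four arrays.
expr = {
    "p1": [1.0, 1.1, 0.9, 1.0],
    "p2": [1.0, 3.0, 5.0, 7.0],
    "p3": [2.0, 2.5, 3.0, 3.5],
    "p4": [0.0, 4.0, 8.0, 12.0],
}
probe_to_gene = {"p1": "g1", "p2": "g2", "p3": "g1", "p4": "g2"}

filtered = filter_by_iqr(expr)                               # drops the flatter half
collapsed = collapse_by_max_variance(filtered, probe_to_gene)  # one probe per gene
```

In this toy case the two flattest probes are filtered out, and the remaining gene is represented by the more variable of its two probes.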

2.3 Statistical test

The merged matrix is ranked by the moderated F-test using the Limma package (Smyth 2004), and genes with a significant p-value (< 0.05) after adjustment by the Benjamini-Hochberg method are selected. After ranking, the top-rank matrix contains 4,307 probes and 48 arrays.
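As an illustration of the multiple-testing adjustment, the following Python sketch computes Benjamini-Hochberg adjusted p-values and applies the 0.05 threshold. It is a generic implementation of the standard step-up procedure, not the code used in the thesis (which relied on the Limma package in R); the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
selected = [i for i, q in enumerate(adjusted) if q < 0.05]
```

Note that a gene with a raw p-value below 0.05 (such as the second and third genes here) can fail the threshold once the adjustment is applied.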

The top-rank matrix is imported into the MultiExperiment Viewer (MeV), one of the Gaggle geese, and subjected to the Significance Analysis of Microarrays (SAM) test (COE versus COEW group, p-value < 0.05, 1,000 permutations, FDR = 0.9).

2.4 Cluster analysis

Hierarchical clustering is performed for the ASM and Heart candidate genes in MeV, using the Pearson correlation metric and average linkage.

The time-series group data, totaling 36 arrays, are averaged for each time point and imported into the Short Time-series Expression Miner (STEM), using the STEM clustering method.
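The averaging of replicate arrays per time point can be sketched as follows. This is only an illustration for a single gene; the function name, the time labels and the expression values are hypothetical, and the number of replicates per time point in the real data is not assumed to be equal.

```python
from collections import defaultdict


def average_by_timepoint(values, timepoints):
    """Average one gene's expression values over arrays sharing a time point."""
    groups = defaultdict(list)
    for value, hour in zip(values, timepoints):
        groups[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(groups.items())}


# Hypothetical gene measured on five arrays at 8, 10 and 12 hours.
values = [1.0, 3.0, 5.0, 6.0, 8.0]
timepoints = [8, 8, 10, 12, 12]
profile = average_by_timepoint(values, timepoints)
```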

2.5 Functional enrichment analysis

Blast2GO (B2G) (Conesa, Götz et al. 2005) is a comprehensive bioinformatics tool for annotation, visualization and analysis in functional genomics research. It offers a suitable platform for functional research in non-model species, such as Ciona intestinalis.

DNA sequences in FASTA format were loaded into Blast2GO. Of the 15,629 genes that remained in Blast2GO, the BLAST and GO-mapping steps yielded GO terms for 3,964 genes. Each test group, taken from the different gene lists, is tested against the reference group (3,964 genes) using Fisher's exact test (p-value < 0.05, FDR correction).
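To make the enrichment test concrete, the sketch below computes a one-sided Fisher's exact test p-value for over-representation of a single GO term, using the hypergeometric tail. Whether Blast2GO was run one-sided or two-sided is not stated in the thesis, so the one-sided form is an assumption; the function name and all counts in the example are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k) for GO-term over-representation.

    N: genes in the reference group, K: reference genes carrying the term,
    n: genes in the test group,      k: test genes carrying the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    # comb() returns 0 for impossible draws, so no extra bounds check is needed.
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / total


# Hypothetical toy counts: 10 reference genes, 4 annotated with the term,
# a test list of 5 genes of which 3 carry the term.
p_value = fisher_enrichment_p(k=3, n=5, K=4, N=10)
```

In a real analysis one such p-value is computed per GO term and the whole set is then corrected for multiple testing.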

2.6 Generation of networks

2.6.1 String protein network

Using the Ensembl gene names in the filt.gene matrix as input, the genes of interest are retrieved from the website of the Search Tool for the Retrieval of Interacting Genes (STRING) database (Szklarczyk, Franceschini et al. 2011) in Text Summary format and converted to the Cytoscape simple interaction format (SIF) (Shannon, Markiel et al. 2003) with a Python script.

2.6.2 Unweighted co-expression network

The Pearson correlation coefficient for all pairwise comparisons of genes is calculated from the filt.gene matrix in R. Highly correlated gene pairs are selected with a cutoff of 0.9 and converted to the simple interaction format (SIF) (Shannon, Markiel et al. 2003) with Python.
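The construction of the unweighted network can be sketched in a few lines of Python. This is not the thesis script: the thesis computed correlations in R and only used Python for the conversion, whereas the sketch does both steps in Python for self-containment. The gene names, the expression values and the edge label "coexp" are hypothetical. Each SIF line has the form "source, interaction type, target", separated here by tabs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coexpression_sif(expr, cutoff=0.9, relation="coexp"):
    """Return SIF lines for every gene pair correlated above the cutoff."""
    lines = []
    for g1, g2 in combinations(sorted(expr), 2):
        if pearson(expr[g1], expr[g2]) > cutoff:
            lines.append(f"{g1}\t{relation}\t{g2}")
    return lines


# Hypothetical expression profiles across four arrays.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
sif_lines = coexpression_sif(expr)
```

Only positively correlated pairs pass the cutoff in this sketch; strongly anti-correlated pairs such as geneA and geneC are excluded because the signed coefficient is compared with 0.9.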

2.6.3 Weighted co-expression network

2.6.3.1 Network construction

The procedure follows the one given on the WGCNA website (Horvath 2011).

2.6.3.2 Module detection

Pearson correlation coefficients are calculated for all pairwise comparisons of genes across all samples. The resulting Pearson correlation matrix is transformed into a weighted adjacency matrix using the soft-thresholding power β = 6 chosen during network construction. Average linkage hierarchical clustering is used to group genes on the basis of the topological overlap dissimilarity measure of their network connection strengths (Zhang, Horvath 2005). Using a dynamic tree-cutting algorithm (Langfelder, Zhang et al. 2008), 13 modules are found with a minimum cluster size of 70 (Figure 6). Genes that are not assigned to any module are colored grey.
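The transformation from correlation to weighted adjacency and then to topological overlap dissimilarity can be sketched as below. The sketch assumes the unsigned adjacency a_ij = |cor_ij|^β and the topological overlap formula of Zhang and Horvath (2005); whether the thesis used the unsigned or a signed variant is not stated. The function names and the toy correlation matrix are hypothetical, and the real analysis was run with the WGCNA package in R.

```python
def soft_threshold(cor, beta=6):
    """Unsigned weighted adjacency a_ij = |cor_ij| ** beta, with a zero diagonal."""
    n = len(cor)
    return [[abs(cor[i][j]) ** beta if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def tom_dissimilarity(adj):
    """Topological overlap dissimilarity 1 - TOM for a zero-diagonal adjacency."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each node
    diss = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Connection strength shared through all other nodes u.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            diss[i][j] = 1 - tom
    return diss


# Hypothetical correlation matrix for three genes.
cor = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
adj = soft_threshold(cor, beta=6)
diss = tom_dissimilarity(adj)
```

Raising correlations to the sixth power keeps the strong pair (0.9 becomes about 0.53) while pushing weak correlations close to zero, so the dissimilarity matrix passed to hierarchical clustering separates the third gene clearly from the first two.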

2.6.3.3 Module significance

The p-value of the moderated t-test is taken from the topTable output of the affylmGUI package in R (Smyth 2004).
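The Results define significance as -log10 of the moderated t-test p-value. A minimal sketch of how a per-module summary could be derived from such values is given below; averaging over the genes of a module is an assumption based on the usual WGCNA convention, and the function name, gene names, p-values and module assignments are hypothetical.

```python
from collections import defaultdict
from math import log10


def module_significance(pvalues, modules):
    """Average -log10(p) over the genes assigned to each module."""
    by_module = defaultdict(list)
    for gene, p in pvalues.items():
        by_module[modules[gene]].append(-log10(p))
    return {module: sum(vals) / len(vals) for module, vals in by_module.items()}


# Hypothetical p-values and module colors for four genes.
pvalues = {"g1": 0.001, "g2": 0.01, "g3": 0.1, "g4": 1.0}
modules = {"g1": "tan", "g2": "tan", "g3": "grey", "g4": "grey"}
significance = module_significance(pvalues, modules)
```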

2.7 Network visualization

2.7.1 File format

The output files from WGCNA are converted to the simple interaction format (SIF) (Shannon, Markiel et al. 2003) with Python.

2.7.2 Analyzing network by plugin in Cytoscape

The AllegroMCODE and Network Analysis plugins in Cytoscape are used to analyze the network. AllegroMCODE detects clusters automatically.

3. RESULTS

3.1 Differential expression

3.1.1 Expectation of the Microarray data

Genes that are up-regulated when COE is overexpressed or down-regulated upon COE loss of function are considered ASM candidate genes downstream of COE, while genes that are down-regulated when COE is overexpressed or up-regulated upon COE loss of function are considered Heart candidate genes repressed by COE (Stolfi, Gainous et al. 2010).

Using the COE and COEW groups as the two classes in the Significance Analysis of Microarrays (SAM), the contrast should yield ASM and Heart candidate genes.

3.1.2 ASM and Heart candidate genes

3.1.2.1 Lists from SAM

SAM yields 336 significant genes, which are separated into 206 ASM candidate genes (negative in SAM, expression in the COE group lower than in the COEW group) and 130 Heart candidate genes (positive in SAM, expression in the COE group higher than in the COEW group). These two groups can be distinguished by the first three columns in the heat-map (Figure 3, Figure 27).

Based on hierarchical clustering and visual inspection, the ASM candidate genes can be roughly divided into three large groups:

A1. The first group (up-down-up-ASM, 61 genes) shows a “U”-shaped curve through the time-series experiments, with the earliest up-regulation right at the experimental time point of 8 hours. This group contains Snail (‘SNAIL’ in the thesis), SET and MYND Domain 1 (SMYD1) and Myoblast determination protein (Myod, ‘MYOD’ in the thesis).

A2. The second group (early-ASM, 45 genes), including COE and the Myocyte Regulatory Light Chain gene (MRLC5, ‘MYL5’ in the thesis), shows early up-regulation around 14 hours.

A3. The third group (late-ASM, 100 genes) has relatively late up-regulation after 18 hours, and contains myosin heavy chain genes (MHC3), tropomyosin 1 (TPM1, ‘CTM1’ in the thesis) and muscle like actin 2 (MA2).

The Heart candidate genes can be divided into two large groups:

H1. The first group (early-Heart, 99 genes) shows early up-regulation (before 20 hours) and contains the heart markers BMP2/4, NK4, NOTRLC/HAND-LIKE, and ETS/POINTED2.

H2. The second group (late-Heart, 31 genes) displays relatively late up-regulation (after 20 hours) and contains mesenchyme specific gene 3 (MECH3).

As expected, the two lists contain important marker genes and show distinct temporal expression patterns. However, neither the ASM nor the Heart candidate genes show GO-term enrichment in Blast2GO, which may indicate that the lists need fine-tuning, although the small number of GO terms available in Blast2GO is another concern. Further improving the ASM and Heart candidate gene lists would require knowing the effects of the non-specific filtering, of selecting the probe for each gene by maximum variance, and of the SAM ranking.

3.1.2.2 Clusters from STEM

A total of 7 significant model profiles appear in the STEM output. Of the 206 ASM candidate genes, 23 are in the significant profiles. Most of them are in profile 20, which is similar to the late-ASM group and includes the MHC3, MA2 and MYL5 genes. For the Heart candidate genes, 13 out of 130 are in the significant profiles.

3.2 Network Visualization in Gaggle

3.2.1 Networks

3.2.1.1 STRING protein network

The STRING (Szklarczyk, Franceschini et al. 2011) protein network is created to make good use of existing data resources. It provides both experimental interaction information and interactions predicted by computational techniques, shown as different edge colors (Figure 9).

3.2.1.2 Co-expression network

Network-based approaches, also termed graph-based approaches, aim to extract recurrent expression patterns or conserved modules from the rapidly accumulating microarray datasets. A microarray dataset is modeled as a relation graph in which each node represents one gene, and two genes are connected by an edge according to an expression correlation parameter (Zhang, Horvath 2005) that measures the similarity between their expression profiles (the Pearson correlation coefficient is used in this thesis). The graph, or network, can be represented by an adjacency matrix that encodes whether a pair of nodes is connected. For unweighted networks, the entries are 1 or 0. For weighted networks, the adjacency matrix reports the connection strength of each gene pair as a value between 0 and 1 (Zhang, Horvath 2005). The graph-theoretic concept of connectivity, also termed degree, is the row sum of the adjacency matrix: it counts the direct neighbors of a node in an unweighted network and sums the connection strengths of a node in a weighted network.
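The row-sum definition of connectivity can be illustrated with a short Python sketch that applies the same function to an unweighted and a weighted adjacency matrix. Both matrices are hypothetical toy examples, and self-connections on the diagonal are excluded from the sum as an assumption.

```python
def connectivity(adj):
    """Row sums of an adjacency matrix, ignoring the diagonal."""
    return [sum(w for j, w in enumerate(row) if j != i)
            for i, row in enumerate(adj)]


# Hypothetical unweighted network: node 0 is linked to nodes 1 and 2.
unweighted = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
# Hypothetical weighted network with connection strengths between 0 and 1.
weighted = [
    [0.0, 0.8, 0.3],
    [0.8, 0.0, 0.1],
    [0.3, 0.1, 0.0],
]

degree = connectivity(unweighted)   # number of direct neighbors per node
strength = connectivity(weighted)   # summed connection strength per node
```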

Two co-expression networks are generated in this thesis.

The unweighted co-expression network is formed by the gene pairs with a Pearson correlation coefficient higher than 0.9. It contains a total of 766 nodes and has a clustering coefficient of 0.311 (reported by the Network Analysis plugin in Cytoscape; the clustering coefficient measures the cohesiveness of the neighborhood of a node).

The gene pairs with the 5,000 strongest weights (weight cutoff 0.23) are exported to build the weighted co-expression network, which contains a total of 814 nodes and has a clustering coefficient of 0.728.
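A network-level clustering coefficient of the kind reported for the two networks can be computed as the average of the local clustering coefficients of all nodes. The sketch below is a generic implementation for an undirected, unweighted edge list; it assumes that nodes with fewer than two neighbors contribute a coefficient of 0, which may differ from the convention of the Cytoscape plugin. The function name and the example edges are hypothetical.

```python
from itertools import combinations


def average_clustering(edges):
    """Mean local clustering coefficient of an undirected, unweighted network."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    coefficients = []
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            coefficients.append(0.0)  # assumption: too few neighbors counts as 0
            continue
        # Count the edges that actually exist among the node's neighbors.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


# Hypothetical network: a triangle A-B-C with one extra node D attached to C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
coefficient = average_clustering(edges)
```

In this toy network A and B have a coefficient of 1, C has 1/3 and D has 0, giving an average of 7/12.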

The unweighted network has more isolated clusters consisting of only 2 nodes linked by 1 edge. The weighted network is denser, contains some hubs (nodes with high connectivity), and also colors each node by the module detected for it in WGCNA.

Although the two networks differ in their adjacency matrices, both are based on the Pearson correlation coefficient and place highly similar genes close to each other in the graph. In other words, genes with the same expression profile across all of the experiments are close to each other in the network. These network-based approaches allow the position of a biological entity to be explored in the context of its local neighborhood in the graph and of the network as a whole, and they are less troubled by the inherent noise that confounds conventional pairwise approaches (Freeman, Goldovsky et al. 2007).

3.2.2 Findings from the network visualization in Gaggle

3.2.2.1 Gaggle as information integration center

In this post-genomic era, biologists often face the challenge of freely exploring experimental and computational data from many different sources and diverse software tools, for example storing different data for genes, retrieving data for a list of genes, and mapping one list of genes onto another. Once the network has been loaded into Cytoscape, Gaggle, as an information integration center, can help to solve these problems for microarray data.

Storing different data for genes can be achieved by labeling. As shown in Figures 9 and 10, the two networks present data from 6 different sources: node color for module, node label for ASM or Heart candidate genes, node shape for significance in the moderated F-test, node size for connectivity, edge color for interaction type, and distance between nodes for closeness. The network in Cytoscape therefore functions as a visual database.

Retrieving data for a list of genes, such as an expression matrix, is also feasible through the basic “broadcast” function in Gaggle. For example, a list of genes of interest in Cytoscape can be sent to the Gaggle Boss and then broadcast to the Data Matrix Viewer (DMV), which can output the expression matrix.

Mapping one list of genes onto another can be done conveniently in Gaggle through the many functions that it offers. In the MultiExperiment Viewer (MeV), a sub-list of genes can be launched in a new viewer. In Cytoscape, the function “Create new network from selected nodes” can be used for this task. Between different tools, the “broadcast” function serves as a bridge to transfer the list and map it in the existing tools.

3.2.2.2 Module from AllegroMCODE

The main goal of the co-expression network visualization is to find groups of highly correlated genes (modules) related to the ASM or Heart network, specifically aiming to infer targets of the transcription factor COE.

In the unweighted network, which has no predefined modules, the modules can be detected automatically by AllegroMCODE, a Cytoscape plugin that finds highly interconnected groups of nodes in a large, complex network. The 1st module detected by AllegroMCODE for the unweighted network is shown in Figure 11. This module is significantly enriched in biological process terms (Figure 12), such as biosynthetic process and cellular biosynthetic process.

For the weighted network, the 1st module (Figure 13) detected by AllegroMCODE consists largely of turquoise module genes (with only 1 grey gene). This module is significantly enriched in intracellular process (Figure 14).

The 1st modules of the unweighted and weighted networks both contain ribosome-related genes (gene names starting with “RP”). Because the two networks are both generated from the same microarray data, an external reference is necessary to determine whether this ribosome group was found by chance. Comparing the 1st module in the weighted network with all turquoise module genes in the STRING network gives a common list of 23 genes, 16 of which are ribosome-related.

3.2.2.3 Module from weighted network

Weighted correlation network analysis (WGCNA) has advantages in identifying candidate targets owing to its unique mathematical features (Langfelder, Horvath 2008). While highly correlated genes are grouped into different modules, genes that are far from every module are depicted in grey. Figure 18 shows that these grey genes in the weighted network often have fewer edges and are targeted at miRNA, which reasonably sets them apart from the other functional modules.

In Figure 7 and Figure 8, the tan and brown modules have strong module significance (the significance is defined as -log10 of the p-value in the moderated t-test). Visualizing the 50 genes with the highest intramodular connectivity in each of these two modules shows that both modules are enriched in ASM and Heart candidate genes. Interestingly, the NK4 gene is in the tan module with other genes (Figure 19). The Islet (ISL) gene, which is not in the candidate list yet has been reported to be an ASM gene, is in the brown module together with some known markers, such as MA2, MHC3, NOTRLC/HAND-LIKE, and ETS/POINTED2 (Figure 20). These results could serve as a starting point for forming hypotheses about the Heart network in Ciona.

Because the turquoise module is the largest module in the weighted network and is enriched in cellular process and other terms (Figure 21), it is natural to consider narrowing the list of turquoise module genes with other conditions. The list of genes obtained by combining the turquoise module with the STEM condition shows clear temporal expression and enrichment in muscle- and heart-related GO terms (Figure 22, Figure 23), while containing only four genes found in the candidate list.

3.2.2.4 Fine-tuned list

The network in Gaggle can serve as a visualization center as well as a fine-tuning filter for a list of genes, because the network is built from highly correlated pairs of genes with reduced noise. This does not mean that genes absent from the network should be discarded, but it is good to have the expected GO-term enrichment to confirm the list. Because GO-term enrichment depends on the proportion of genes sharing the same GO terms, the number of noisy genes in the whole list has a great impact on the enrichment. Importing the candidate list into the co-expression network reduces the noise and yields a better enrichment result.

Using the “broadcast” function in MeV, Cytoscape can receive the 336 significant genes, label them in yellow in the unweighted network, and then create a sub-network for the candidate genes. A subgroup of the candidate genes (Figure 24) is significantly enriched in muscle- and heart-related GO terms (Figure 25), an enrichment that Blast2GO could not previously report. The ASM candidate genes in the network are also enriched in muscle and heart GO terms (Figure 26), while the Heart candidate genes in the network still show no enrichment in Blast2GO.

4. DISCUSSION

4.1 ASM candidate genes

COE is necessary and sufficient to specify ASM fate (Stolfi,
Gainous et al. 2010). It is understandable that COE is expressed
earlier than the late-ASM genes (A3 group), such as MHC3, TPM1 and
MA2. The up-down-up-ASM genes (A1 group), which include MYOD, show
the earliest up-regulation. In Xenopus, the cross-regulatory
interactions of COE orthologs with genes of the Myogenic Regulatory
Factor (MRF) family, such as MYOD and MYF5, are crucial for muscle
commitment and differentiation (Green, Vetter 2011). However, how COE
may repress the cardiac fate and promote cell migration in Xenopus has
never been studied. A possible hypothesis is that, in Ciona, the early
functions controlled by COE in ASM precursors are independent of MRF
activation, since the MRF in the A1 group is up-regulated earlier than
COE in the A2 group.

Moreover, the A1 group genes are more likely to be TVC genes, which
could also explain why heart-related GO terms appear in the enrichment
of the ASM genes in the weighted network (Figure 26).

4.2 Annotation in Ciona intestinalis

The draft genome sequence of the ascidian Ciona intestinalis
(Dehal, Satou et al. 2002) has been a valuable research resource.
However, there are numerous inconsistencies in the gene models
because of the intrinsic limitations of gene prediction programs and
the fragmented nature of the assembly (Satou, Mineta et al. 2008).
Therefore, the probe annotation in this study focuses on combining
available resources from various databases, such as Aniseed (Tassy,
Dauga et al.), Ensembl Genome Browser (Kersey, Lawson et al. 2010),
CIPRO (Endo, Ueno et al.), STRING (Szklarczyk, Franceschini et al.
2011) and UCSC Genome Browser (Karolchik, Hinrichs et al. 2011), as
well as internal files from Dr. Lionel Christiaen’s lab. The 30,969
probes correspond to 16,250 non-redundant genes, which serve as the
criterion for mapping a probe to a gene. Differences between the gene
annotation in this thesis and other sources are unavoidable.

4.3 Functional ribosome group and COE

The highly linked ribosome genes in the STRING network (Figure 15),
enriched in ribosome-related processes (Figure 16), naturally raise a
question: what is the relationship between this functional ribosome
group and COE? After broadcasting this list of ribosome genes and COE
to MeV, the heat-map and expression plot show that the ribosome group
and COE behave similarly across the time-series experiments. This
group of ribosome genes also has quite a stable expression profile.
More housekeeping genes are likely to be found in the same module as
the ribosome group, but this is not the focus of this thesis.


4.4 Time-series

Although clustering algorithms such as hierarchical clustering
(Eisen, Spellman et al. 1998), K-means, and Self-Organizing Maps (SOM)
(Tamayo, Slonim et al. 1999) can be used to analyze Microarray data
and yield many biological insights, they are not designed for
time-series data: they assume that the data at each time point are
collected independently of each other and ignore the sequential nature
of time-series data (Ernst, Nau et al. 2005). This thesis applies the
Short Time-series Expression Miner (STEM), which is designed for the
analysis of short time-series Microarray gene expression data (Ernst,
Bar-Joseph 2006), to the time-series experiments in the hope of
finding clues about the true biological patterns. The STEM algorithm
(Ernst, Nau et al. 2005) starts by selecting a set of potential
expression profiles that covers the space of all possible expression
profiles that can be generated by the genes in the experiment, each
representing a unique temporal expression pattern. Next, each gene is
assigned to one of the profiles, and a permutation test identifies the
model profiles with significantly more genes than expected. The
significant model profiles are then grouped into larger clusters by a
greedy algorithm (Ernst, Nau et al. 2005) and are shown in color at
the top of the list in the user interface.

It is worth mentioning that STEM is designed for short time series
(defined as 3 to 8 time points on its website), whereas this
Microarray dataset has 11 time points.
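The profile-assignment step can be sketched as follows. This is a simplified illustration, not STEM's implementation: the four model profiles and the two genes are hypothetical, correlation is assumed as the similarity measure, and the permutation test and greedy grouping of profiles are omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical model profiles over 5 time points (change relative to time 0).
MODEL_PROFILES = {
    "up": [0, 1, 2, 3, 4],
    "down": [0, -1, -2, -3, -4],
    "up-down": [0, 1, 2, 1, 0],
    "down-up": [0, -1, -2, -1, 0],
}


def assign_profile(expression):
    """Return the name of the model profile most correlated with the gene."""
    shifted = [v - expression[0] for v in expression]  # every series starts at 0
    return max(MODEL_PROFILES, key=lambda name: pearson(shifted, MODEL_PROFILES[name]))


# Hypothetical expression values for two genes.
genes = {
    "geneA": [5.0, 5.8, 7.1, 7.9, 9.2],  # steadily increasing
    "geneB": [8.0, 9.1, 9.9, 9.0, 8.2],  # transient peak
}
assignments = {name: assign_profile(values) for name, values in genes.items()}
print(assignments)
```

A gene with a constant expression series has zero variance and would need to be filtered out before this step.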


4.5 Limitations of the co-expression network

Co-expression network approaches have several limitations. First,
the network similarity is based on the Pearson correlation
coefficient, which is sensitive to outliers, so the quality of the
input matrix is important to the final result. Transforming the data
or using Spearman’s rank correlation coefficient instead could help.
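The sensitivity to outliers can be seen in a small sketch with made-up expression values: two otherwise unrelated genes share one extreme sample, which is enough to produce a Pearson correlation close to 1, while the rank-based Spearman coefficient stays low. The rank helper assumes there are no tied values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Ranks 1..n of the values (assumes no ties, for brevity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values; the last sample is an outlier in both genes.
gene_a = [1.0, 1.2, 0.9, 1.1, 1.05, 0.8, 9.0]
gene_b = [2.1, 1.9, 2.0, 1.8, 2.2, 2.05, 12.0]

r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```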

A second limitation is that a co-expression network based on the
Pearson correlation coefficient is better suited to finding globally
co-expressed genes (Qian, Dolled-Filhart et al. 2001); it cannot
accurately detect time-delayed or transient responses of downstream
effectors in time-series experiments. It would be better to use local
clustering (Qian, Dolled-Filhart et al. 2001) to find time-delayed or
locally co-expressed genes, or other tools specialized for long
time-series experiments, such as the Graphical Query Language (GQL)
(Costa, Schönhuth et al. 2005).
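A minimal sketch of the time-delay problem follows. It uses a plain lagged Pearson correlation as a simplified stand-in, not the local clustering algorithm of Qian et al. or GQL, and the two 11-point profiles are hypothetical: the target repeats the regulator's pulse two time points later.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_pearson(x, y, lag):
    """Correlate x[t] with y[t + lag] over the overlapping time points."""
    return pearson(x[: len(x) - lag], y[lag:])


# Hypothetical profiles over 11 time points.
regulator = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

by_lag = {lag: lagged_pearson(regulator, target, lag) for lag in range(4)}
best_lag = max(by_lag, key=by_lag.get)

for lag, r in by_lag.items():
    print(f"lag {lag}: r = {r:.2f}")
print("best lag:", best_lag)
```

At lag 0 the two genes look uncorrelated, so a network built on the ordinary Pearson coefficient would miss this pair even though one profile is an exact delayed copy of the other.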

A third limitation is that it is difficult to pick thresholds for a
biological network. The hard threshold for the unweighted network
arbitrarily cuts off some biologically meaningful edges. Weakly
weighted modules are also cut off in the weighted network, even though
such weak linkages may be biologically meaningful.
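The difference between the two thresholding schemes can be sketched as follows. The correlations and the hard cut-off of 0.8 are hypothetical, the power of 6 is the soft-threshold power used in this thesis, and an unsigned adjacency (absolute correlation raised to the power) is assumed.

```python
def hard_adjacency(correlation, tau):
    """Unweighted network: keep an edge only if |cor| reaches the cut-off."""
    return 1 if abs(correlation) >= tau else 0


def soft_adjacency(correlation, beta):
    """Weighted network: |cor| raised to the soft-threshold power."""
    return abs(correlation) ** beta


# Hypothetical pairwise correlations.
correlations = [0.95, 0.81, 0.79, 0.40]
TAU, BETA = 0.8, 6  # TAU is illustrative; BETA = 6 as chosen in this thesis

hard = [hard_adjacency(c, TAU) for c in correlations]
soft = [soft_adjacency(c, BETA) for c in correlations]

for c, h, s in zip(correlations, hard, soft):
    print(f"cor = {c:.2f}  hard = {h}  soft = {s:.3f}")
```

The pairs with correlations 0.81 and 0.79 receive almost the same soft weight but fall on opposite sides of the hard cut-off, while the weak pair at 0.40 is kept in the weighted network only with a weight close to zero.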


Figures and tables

Figure 1 Pipeline.


Figure 2 Normalized unscaled standard error (NUSE).

One of the tests in the arrayQualityMetrics, NUSE, detected sample

LacZ3 as an outlier.

Figure 3 Heat-‐map of ASM and Heart candidate genes.

ASM candidate genes are red in the first and third column. A1:
up-down-up-ASM. A2: early-ASM. A3: late-ASM. Heart candidate genes
are red in the second column. H1: early-Heart. H2: late-Heart.


Figure 4 Output of the Short Time-series Expression Miner.

Significant clusters are colored in the top row.

Left panel: Scale independence (x-axis: Soft Threshold (power), 1 to
20; y-axis: Scale Free Topology Model Fit, signed R^2, 0.3 to 0.9).

Right panel: Mean connectivity (x-axis: Soft Threshold (power), 1 to
20; y-axis: Mean Connectivity, 0 to 1500).

Figure 5 Selecting soft power.

A soft-threshold power (beta) of 6 is chosen for calculating the
adjacency matrix because it reaches a high scale-free topology model
fit (R^2) while retaining high mean connectivity.


Figure 6 Ciona intestinalis weighted co-expression network.

The dendrogram results from average linkage hierarchical clustering.
The color band below the dendrogram denotes the modules, which are
defined as branches of the dendrogram. Of the 10,079 genes, 6,162
were clustered into 13 modules, and the remaining genes are colored
in grey.


Upper panel: Dynamic-cutree Module Significance (COE-COEW modt),
p = 3.1e-86 (x-axis: Dynamic Module; y-axis: coesig, 0.0 to 0.8).

Lower panel: number of genes per module (y-axis: Counts, 0 to 4000).

Modules shown: black, blue, brown, green, greenyellow, grey, magenta,
pink, purple, red, tan, turquoise, yellow.

Figure 7 Module significance.

Module significance is determined as the average absolute gene
significance (defined as the negative logarithm of a p-value) over all
genes in a given module.
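The definition above can be written out as a short sketch. The p-values and module assignments are hypothetical, and a base-10 logarithm is assumed because the caption does not state the base.

```python
from collections import defaultdict
from math import log10


def module_significance(p_values, modules):
    """Average absolute gene significance, -log10(p), for each module."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for gene, p in p_values.items():
        gene_significance = abs(-log10(p))
        totals[modules[gene]] += gene_significance
        counts[modules[gene]] += 1
    return {module: totals[module] / counts[module] for module in totals}


# Hypothetical per-gene p-values and module colors.
p_values = {"g1": 0.001, "g2": 0.01, "g3": 0.5, "g4": 1.0}
modules = {"g1": "brown", "g2": "brown", "g3": "grey", "g4": "grey"}

significance = module_significance(p_values, modules)
for module, value in significance.items():
    print(f"{module}: {value:.3f}")
```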


Scatter plots, one panel per module (x-axis: Connectivity; y-axis:
Gene COE-COEW Significance). Panel titles:

grey cor=-0.023, p=0.14
pink cor=-0.066, p=0.36
turquoise cor=-0.0093, p=0.75
magenta cor=0.11, p=0.19
red cor=-0.09, p=0.036
blue cor=0.28, p=2.3e-22
tan cor=0.5, p=1e-07
brown cor=0.61, p=5.9e-79
black cor=0.24, p=2.6e-06
greenyellow cor=-0.13, p=0.14
yellow cor=-0.044, p=0.27
green cor=-0.079, p=0.054
purple cor=-0.094, p=0.27

Figure 8 Intramodular connectivity and module significance.

Intramodular connectivity measures how connected, or co-expressed,
a given node is with respect to the nodes of a particular module. It is
the connectivity in the subnetwork defined by the module.
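As a minimal sketch of this definition, intramodular connectivity can be computed as the sum of a gene's adjacencies to the other genes of its own module. The four genes, their weights and their module colors below are hypothetical.

```python
def intramodular_connectivity(adjacency, modules):
    """Sum of each gene's adjacencies to the other genes in its own module."""
    within = {}
    for gene, neighbours in adjacency.items():
        within[gene] = sum(
            weight
            for other, weight in neighbours.items()
            if other != gene and modules[other] == modules[gene]
        )
    return within


# Hypothetical symmetric weighted adjacency (dictionary of dictionaries).
adjacency = {
    "g1": {"g2": 0.8, "g3": 0.5, "g4": 0.3},
    "g2": {"g1": 0.8, "g3": 0.6, "g4": 0.1},
    "g3": {"g1": 0.5, "g2": 0.6, "g4": 0.2},
    "g4": {"g1": 0.3, "g2": 0.1, "g3": 0.2},
}
modules = {"g1": "tan", "g2": "tan", "g3": "tan", "g4": "blue"}

k_within = intramodular_connectivity(adjacency, modules)
for gene, value in k_within.items():
    print(f"{gene} ({modules[gene]}): {value:.1f}")
```

Edges to genes outside the module, such as those to g4 here, contribute to a gene's whole-network connectivity but not to its intramodular connectivity.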


Figure 9 STRING protein network.

The edge colors represent different types of evidence. Neighborhood:
green; Gene Fusion: red; Co-occurrence: blue; Coexpression: black;
Experimental: magenta; Databases: cyan; Textmining: greenyellow;
Homology: light blue.

Figure 10 Labeling in weighted network.

Different labels in the network represent different data. Node
color: module color; node border color: significant clusters in STEM;
node shape: genes significant in the moderated F test are diamonds,
while non-significant genes are round; node label color: ASM
candidate genes are blue and Heart candidate genes are red.

Figure 11 The 1st module inferred by AllegroMCODE for the unweighted
co-expression network.


Figure 12 Enrichment of the 1st module of the unweighted
co-expression network.

Figure 13 The 1st module inferred by AllegroMCODE for the weighted
co-expression network.

Figure 14 The 1st module of weighted network enrichment.


Figure 15 Ribosome group in STRING.

Differential GO-term Distribution (Test Set vs. Reference Set;
x-axis: % Sequences, 0 to 95; y-axis: GO Terms). GO terms shown:

ribosome; structural constituent of ribosome
translation; ribonucleoprotein complex; structural molecule activity
cytosolic ribosome; cytosolic part
small ribosomal subunit; translational elongation
cellular protein metabolic process; gene expression
cellular macromolecule biosynthetic process; macromolecule biosynthetic process
cytosolic small ribosomal subunit; endocrine pancreas development
translational termination; protein metabolic process
non-membrane-bounded organelle; intracellular non-membrane-bounded organelle
macromolecular complex; cellular protein complex disassembly
cellular macromolecular complex disassembly; protein complex disassembly
endocrine system development; macromolecular complex disassembly
cellular biosynthetic process; pancreas development
viral genome expression; viral transcription
viral infectious cycle; cellular component disassembly
viral reproductive process; biosynthetic process
cytosol; reproductive cellular process
cellular macromolecule metabolic process; viral reproduction
cytoplasmic part; macromolecule metabolic process
large ribosomal subunit; reproduction
ribosome biogenesis; macromolecular complex subunit organization
cellular macromolecular complex subunit organization; reproductive process
rRNA metabolic process; rRNA processing
cytosolic large ribosomal subunit; cytoplasm
cellular metabolic process; rRNA binding
ribonucleoprotein complex biogenesis; primary metabolic process
RNA binding; ribosomal small subunit biogenesis
ncRNA processing; developmental process
intracellular organelle; organelle
multicellular organismal development; ncRNA metabolic process
system development; organ development
erythrocyte homeostasis; intracellular
metabolic process; cellular component biogenesis

Figure 16 Enrichment of the ribosome group in the STRING network.

Figure 17 Ribosome group and COE.

Figure 18 Grey color genes.

Figure 19 Tan module.

Figure 20 Brown module.

Figure 21 Turquoise module enrichment.

Figure 22 Genes in turquoise plus STEM condition.

Figure 23 Enrichment of genes in turquoise plus STEM condition.

Figure 24 Sub-group of candidate genes in the unweighted network.

[Bar chart: Differential GO-term Distribution. Series: Test Set and Reference Set. x-axis: % Sequences (0.0 to 42.5); y-axis: GO Terms, including cardiac muscle tissue development, heart process, heart contraction, and positive regulation of heart contraction.]

Figure 25 Enrichment of the sub-group of candidate genes in the unweighted network.

Figure 26 Enrichment of ASM candidate genes in the weighted network.

Figure 27 ASM and heart candidate genes.

Part A illustrates the generation of ASM and heart cells from TVCs. Part B summarizes the temporal expression groups of ASM and heart candidate genes, with gene counts and known markers. Arrows represent the trend of their temporal expression.

References

1. BARABASI, A. and BONABEAU, E., 2003. Scale-free networks. Scientific American, 288(5), pp. 60-69.

2. BARABASI, A. and OLTVAI, Z., 2004. Network biology: Understanding the cell's functional organization. Nature Reviews Genetics, 5(2), pp. 101-113.

3. CHRISTIAEN, L., DAVIDSON, B., KAWASHIMA, T., POWELL, W., NOLLA, H., VRANIZAN, K. and LEVINE, M., 2008. The transcription/migration interface in heart precursors of Ciona intestinalis. Science, 320(5881), pp. 1349-1352.

4. CONESA, A., GÖTZ, S., GARCÍA-GÓMEZ, J., TEROL, J., TALÓN, M. and ROBLES, M., 2005. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Oxford: Oxford University Press.

5. COSTA, I., SCHÖNHUTH, A. and SCHLIEP, A., 2005. The Graphical Query Language: a tool for analysis of gene expression time-courses. Oxford: Oxford University Press.

6. DAVIDSON, B., 2007. Ciona intestinalis as a model for cardiac development. London, UK: Academic Press.

7. EISEN, M.B., SPELLMAN, P.T., BROWN, P.O. and BOTSTEIN, D., 1998. Cluster analysis and display of genome-wide expression patterns. Washington, D.C.: National Academy of Sciences.

8. ERNST, J. and BAR-JOSEPH, Z., 2006. STEM: a tool for the analysis of short time series gene expression data. London: BioMed Central.

9. ERNST, J., NAU, G. and BAR-JOSEPH, Z., 2005. Clustering short time series gene expression data. Oxford: Oxford University Press.

10. FREEMAN, T., GOLDOVSKY, L., BROSCH, M., VAN DONGEN, S., MAZIÈRE, P., GROCOCK, R., FREILICH, S., THORNTON, J. and ENRIGHT, A., 2007. Construction, visualisation, and clustering of transcription networks from microarray expression data. San Francisco, CA: Public Library of Science.

11. GENTLEMAN, R., 2005. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. New York: Springer-Verlag.

12. GREEN, Y. and VETTER, M., 2011. EBF proteins participate in transcriptional regulation of Xenopus muscle development. San Diego [etc.]: Academic Press.

13. HORVATH, S., 2011. Weighted Network Analysis: Applications in Genomics and Systems Biology. New York: Springer.

14. KAUFFMANN, A., GENTLEMAN, R. and HUBER, W., 2009. arrayQualityMetrics - a Bioconductor package for quality assessment of microarray data. Oxford: Oxford University Press.

15. LANGFELDER, P. and HORVATH, S., 2008. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 9, pp. 559.

16. LANGFELDER, P., ZHANG, B. and HORVATH, S., 2008. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Oxford: Oxford University Press.

17. QIAN, J., DOLLED-FILHART, M., LIN, J., YU, H. and GERSTEIN, M., 2001. Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions. London: Academic Press.

18. SHANNON, P., MARKIEL, A., OZIER, O., BALIGA, N., WANG, J., RAMAGE, D., AMIN, N., SCHWIKOWSKI, B. and IDEKER, T., 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.

19. SHANNON, P., REISS, D., BONNEAU, R. and BALIGA, N., 2006. The Gaggle: An open-source software system for integrating bioinformatics software and data sources. BMC Bioinformatics, 7, pp. 176.

20. SMYTH, G., 2004. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. [Berkeley, CA]: Berkeley Electronic Press.

21. STOLFI, A., GAINOUS, T.B., YOUNG, J.J., MORI, A., LEVINE, M. and CHRISTIAEN, L., 2010. Early Chordate Origins of the Vertebrate Second Heart Field. Science, 329(5991), pp. 565-568.

22. SZKLARCZYK, D., FRANCESCHINI, A., KUHN, M., SIMONOVIC, M., ROTH, A., MINGUEZ, P., DOERKS, T., STARK, M., MULLER, J., BORK, P., JENSEN, L. and VON MERING, C., 2011. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. [London]: Information Retrieval Ltd.

23. TAMAYO, P., SLONIM, D., MESIROV, J., ZHU, Q., KITAREEWAN, S., DMITROVSKY, E., LANDER, E.S. and GOLUB, T.R., 1999. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Washington, D.C.: National Academy of Sciences.

24. TUSHER, V., TIBSHIRANI, R. and CHU, G., 2001. Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences of the United States of America, 98(9), pp. 5116-5121.

25. WETTENHALL, J. and SMYTH, G., 2004. limmaGUI: A graphical user interface for linear modeling of microarray data. Bioinformatics, 20(18), pp. 3705-3706.

26. ZHANG, B. and HORVATH, S., 2005. A general framework for weighted gene co-expression network analysis. Statistical Applications in Genetics and Molecular Biology, 4, pp. 17.
