Different computational approaches for inferring network relationships from time-series genomic data on human disease mechanisms have been examined and compared under the recent Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenge. Many of these approaches infer all possible relationships among all candidate genes, which often yields extremely crowded candidate networks with many more false positives than true positives. To overcome this limitation, we introduce a novel approach, Module Anchored Network Inference (MANI), that constructs networks by sequentially analyzing small adjacent building blocks (modules). Using MANI, we inferred a 7-gene adipogenesis network from time-series gene expression data collected during adipocyte differentiation. We also applied MANI to infer two 10-gene networks from time-course perturbation datasets of the DREAM3 and DREAM4 challenges. In these applications, MANI inferred and distinguished serial, parallel, and time-dependent gene interactions and network cascades, and it outperformed other in silico network inference techniques in discovering and reconstructing gene network relationships.
Many established algorithms and approaches are available for inferring gene regulatory networks from large time-course molecular data [
We applied MANI to time-course gene expression data of a 7-gene network during adipocyte differentiation (adipogenesis) [
The goal of MANI is to locally infer gene regulatory relationships with sequential blocks (modules), each containing three genes (shown as a metaphorical window in Figure
Schematic of MANI steps. Step
Possible gene regulatory relationships within a three-gene module. The three genes within a MANI module are labeled
A local network module that contains the three most strongly correlated genes was identified by evaluating Spearman rank correlations between time-series gene expression profiles. Regulatory relationships between genes within the module are inferred by selecting the optimal gene relationships from a list of possible regulatory relationships (Figure
Regulatory gene relationships are mathematically modeled and fitted to gene expression data, and the optimal relationship is identified using a goodness-of-fit measure. Figure
Parameters representing regulatory relationships between three genes (
We use the Bayesian Information Criterion (BIC) (
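As a rough illustration of BIC-based selection among candidate regulatory relationships, the sketch below scores each fitted model from its residuals and parameter count and keeps the lowest score. It assumes the common least-squares form of BIC with Gaussian residuals, which may differ from the exact expression used in this work. The names `bic`, `select_model`, and the candidate labels are hypothetical.

```python
import math


def bic(residuals, n_params):
    """Least-squares BIC: n*ln(RSS/n) + k*ln(n).

    Assumption: Gaussian residuals; this is a textbook form, not
    necessarily the expression used by the authors.
    Raises ValueError if the fit is perfect (RSS == 0).
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + n_params * math.log(n)


def select_model(candidates):
    """Return the name of the candidate with the lowest BIC.

    candidates maps a label to (residuals, number of fitted parameters).
    """
    return min(candidates, key=lambda name: bic(*candidates[name]))


# Hypothetical residuals at 8 time points for two candidate relationships.
candidates = {
    "null": ([0.5] * 8, 1),
    "A_activates_B": ([0.1] * 8, 3),
}
best = select_model(candidates)
```

The penalty term `k*ln(n)` means a relationship with more kinetic parameters is preferred only if it reduces the residual error enough to offset the added complexity.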
The MANI approach was implemented on time-series gene expression data obtained from a network of seven genes that belong to an adipogenesis regulatory network [
Window #1 network inference. (a) Time-series gene expression data of 7 genes within the adipogenesis network collected at 0, 6, 12, 24, 48, 72, 96, and 672 hours during adipocyte differentiation [
Values of kinetic parameters for regulatory relationship in window #1 (RR #1 in Figure
| Parameters | Mean ± standard error (hr⁻¹) |
|---|---|
| | 0.12 ± 0.02 |
| | 0.1 ± 0.02 |
| | 0.13 ± 0.05 |
| | 0.28 ± 0.04 |
| | 0.24 ± 0.05 |
| | 0.03 ± 0.01 |
| | 0.02 ± 0.01 |
| | 0.06 ± 0.03 |
The first two genes in initial windows were selected as the pair(s) of genes with maximum correlation between time-series expression data. A third gene was added by choosing a gene with maximum correlation with either of the genes forming the pair. Among the seven genes (Figure
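The selection of an initial window can be sketched as follows: compute Spearman rank correlations between all pairs of expression profiles, take the most strongly correlated pair, then add the gene most correlated with either member of that pair. This is a minimal sketch, not the authors' implementation. It assumes correlation strength is measured by absolute value, and the gene names and profiles are made up for illustration.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks.

    Not defined for constant profiles (zero rank variance).
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def initial_window(profiles):
    """Pick the top-correlated pair, then the best-matching third gene."""
    genes = sorted(profiles)
    # Assumption: "maximum correlation" is taken as maximum |rho|.
    strength = {
        frozenset(p): abs(spearman(profiles[p[0]], profiles[p[1]]))
        for p in combinations(genes, 2)
    }
    a, b = max(combinations(genes, 2), key=lambda p: strength[frozenset(p)])
    third = max(
        (g for g in genes if g not in (a, b)),
        key=lambda g: max(strength[frozenset((g, a))], strength[frozenset((g, b))]),
    )
    return (a, b, third)


# Hypothetical expression profiles at five time points.
profiles = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [1, 3, 2, 5, 4],
    "g4": [3, 1, 4, 2, 5],
}
window = initial_window(profiles)
```

In this toy example `g1` and `g2` form the starting pair, and `g3` joins them because it correlates more strongly with either of them than `g4` does.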
The possible regulatory relationships between the three genes within a window (listed in Figure
Window #2 optimal regulatory relationship. Expression profiles of all three genes showed nonzero lags (CEBPa (6 hours), CEBPg (6 hours), and PPARg (12 hours)). Between the two genes with shortest lags, CEBPa showed a better fit with external input
A new gene was introduced into the initial window using a One Gene In, One Gene Out (OIOO) rule. Among the remaining genes outside the window, the gene with the highest correlation with any gene inside the current window was added, while the window gene least correlated with the new gene was discarded. By keeping at least one gene and its associated interactions from the previous window, we limited the number of possible regulatory relationships with the new gene(s). If introducing a new gene reproduced an earlier window, the rule was relaxed to include the gene with the next highest degree of correlation with the genes in the window. Window #1 was thus advanced by replacing gene XDH with gene CEBPg because the correlation of CEBPg with KLF4 (
For the new genes in the newly created windows, regulatory relationships were inferred while retaining genes and their associations from previous windows. For example, in window #3, the regulatory relationship of the new gene in the window, CEBPg, was tested taking into account gene relationships to KLF4 and CEBPb from window #1. The time-course expression profiles of genes in window #3 indicated a noticeable lag for CEBPg when compared to genes KLF4 and CEBPb (Figure
Windows of 7-gene adipogenesis network. All windows covering the 7-gene network are shown. Newly inferred gene interactions inside the window are indicated by broken arrows while interactions inferred from a previous window are indicated by solid arrows. Window #5 did not have any broken arrows connecting genes because no new gene relationships were inferred; the null hypothesis was the optimal regulatory relationship connecting genes. Furthermore, windows contributing gene relationships to other windows are shown by solid arrows between windows.
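The One Gene In, One Gene Out rule used to advance a window can be sketched as below. This is an illustrative reading of the rule, not the authors' code. It assumes that candidate genes are all genes outside the current window, that pairwise correlation strengths are precomputed, and that the relaxation step simply moves to the next-ranked candidate. The function name `advance_window` and the gene labels are hypothetical.

```python
def advance_window(window, genes, corr, visited):
    """Advance a three-gene window by one gene in, one gene out.

    window  : frozenset of three gene names
    genes   : iterable of all gene names
    corr    : dict mapping frozenset({a, b}) to a correlation strength
    visited : set of windows (frozensets) already analyzed
    Returns the next window, or None if every candidate repeats an
    earlier window.
    """
    inside = sorted(window)
    outside = sorted(g for g in genes if g not in window)
    # Rank outside genes by their best correlation with any inside gene.
    candidates = sorted(
        outside,
        key=lambda g: max(corr[frozenset((g, w))] for w in inside),
        reverse=True,
    )
    for new_gene in candidates:
        # Discard the inside gene least correlated with the newcomer.
        dropped = min(inside, key=lambda w: corr[frozenset((new_gene, w))])
        next_window = (window - {dropped}) | {new_gene}
        if next_window not in visited:
            return next_window
        # Relaxed rule: fall through to the next-ranked candidate.
    return None


# Hypothetical correlation strengths for five genes.
corr = {
    frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.7, frozenset(("D", "A")): 0.6,
    frozenset(("D", "B")): 0.5, frozenset(("D", "C")): 0.1,
    frozenset(("E", "A")): 0.65, frozenset(("E", "B")): 0.2,
    frozenset(("E", "C")): 0.3,
}
start = frozenset("ABC")
first = advance_window(start, "ABCDE", corr, {start})
relaxed = advance_window(start, "ABCDE", corr, {start, frozenset("ACE")})
```

Because two genes always carry over, the relationships already inferred between them constrain which new relationships need to be tested in the next window.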
The cumulative adipogenesis network inferred by MANI through the 5 windows is shown in Figure
Dynamic adipogenesis network constructed by MANI. The two inputs of the network were
An objective validation of MANI’s performance in network inference was conducted using time-series expression data made available as part of the DREAM3 challenge (Supplementary Figure S2). These data were generated by the challenge organizers by perturbing an in silico network of 10 genes derived from
Comparison of the MANI-inferred DREAM3 network with the correct answer. (a) The 10-gene DREAM3 network that was perturbed by DREAM3 organizers to produce the time-series data. (b) Network inference by MANI.
Since our goal was to infer a sparse network and MANI inferred 10 edges between genes, the top 10 edges inferred by each of the methods were used for comparison. Table
Network inference performance of MANI and other methods.
| Parameters of assessment | ANOVerence | CLR | MANI |
|---|---|---|---|
| | 27.27% | 27.27% | 36.36% |
| | 79.41% | 79.41% | 82.35% |
| | 30% | 30% | 40% |
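The percentages in the table above can be reproduced with a short calculation. The row labels did not survive in the table, so the sketch below rests on a back-calculation that is not stated in the text: the three rows are taken to be sensitivity, specificity, and precision, each method contributes its top 10 edges, and the reference network has 11 true edges among the 45 undirected pairs of 10 genes. The true-positive counts (4 for MANI, 3 for the other two methods) are likewise inferred from the percentages.

```python
def edge_metrics(tp, n_predicted, n_true, n_pairs):
    """Sensitivity, specificity, and precision (in percent) for edge calls.

    tp          : predicted edges that are in the reference network
    n_predicted : number of predicted edges (top 10 per method here)
    n_true      : edges in the reference network (assumed 11)
    n_pairs     : candidate gene pairs (assumed 45 = 10 * 9 / 2)
    """
    fp = n_predicted - tp
    fn = n_true - tp
    tn = n_pairs - n_true - fp
    return {
        "sensitivity": round(100 * tp / (tp + fn), 2),
        "specificity": round(100 * tn / (tn + fp), 2),
        "precision": round(100 * tp / (tp + fp), 2),
    }


# Assumed counts, chosen because they reproduce the table exactly.
mani = edge_metrics(tp=4, n_predicted=10, n_true=11, n_pairs=45)
others = edge_metrics(tp=3, n_predicted=10, n_true=11, n_pairs=45)
```

Under these assumptions, the difference between MANI and the other two methods corresponds to a single additional correctly inferred edge.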
Gene expression data are generated in biological experiments at an increasing rate for the purpose of studying complex gene regulatory mechanisms and human disease mechanisms [
In contrast, MANI adopts a systematic and gradual approach, constructing networks within local modules. This local approach allowed the final constructed adipogenesis network (Figure
We note that the current MANI approach also has several limitations. Inference of hierarchy in the network relies on differences in lags between the expression profiles of different genes. Therefore, a lack of lag differences between genes hinders MANI’s ability to infer regulatory relationships between them. Relationships between genes G1, G2, G5, and G8 in the DREAM3 network (Figure
The authors declare no conflicts of interest.
The work described here was supported by a grant from the National Institutes of Health [R01 HL081690]. Mr. Joe Simard at the Software Site Licensing Department at the University of Virginia Information Technology Services helped to set up the SimBiology toolbox for use within the authors’ lab MATLAB platform. Assistance in using the toolbox to optimize parameters of ODE models was provided by MathWorks staff (Mr. Scott Benway and Ms. Josephine Dula).