Descriptor: "Gene regulatory network inference" / Language: english - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Gene regulatory network inference"' showing total 138 results

1. Network-based analysis of heterogeneous patient-matched brain and extracranial melanoma metastasis pairs reveals three homogeneous subgroups

Author: Konrad Grützmann, Theresa Kraft, Matthias Meinhardt, Friedegund Meier, Dana Westphal, and Michael Seifert
Subjects: Computational cancer biology, Melanoma metastasis, Gene regulatory network inference, Network-based impact propagation, Personalized network-based gene expression and promoter methylation data analysis, Biotechnology, TP248.13-248.65
Abstract: Melanoma, the deadliest form of skin cancer, can metastasize to different organs. Molecular differences between brain and extracranial melanoma metastases are poorly understood. Here, promoter methylation and gene expression of 11 heterogeneous patient-matched pairs of brain and extracranial metastases were analyzed using melanoma-specific gene regulatory networks learned from public transcriptome and methylome data, followed by network-based impact propagation of patient-specific alterations. This data analysis strategy made it possible to predict potential impacts of patient-specific driver candidate genes on other genes and pathways. The patient-matched metastasis pairs clustered into three robust subgroups with specific downstream targets with known roles in cancer, including melanoma (SG1: RBM38, BCL11B; SG2: GATA3, FES; SG3: SLAMF6, PYCARD). Patient subgroups and ranking of target gene candidates were confirmed in a validation cohort. In summary, computational network-based impact analyses of heterogeneous metastasis pairs predicted individual regulatory differences in melanoma brain metastases, culminating in three consistent subgroups with specific downstream target genes.
Published: 2024

2. CVGAE: A Self-Supervised Generative Method for Gene Regulatory Network Inference Using Single-Cell RNA Sequencing Data.

Author: Liu, Wei, Teng, Zhijie, Li, Zejun, and Chen, Jing
Subjects: GRAPH neural networks, MULTILAYER perceptrons, GENE expression, RNA sequencing, GAUSSIAN distribution
Abstract: Gene regulatory network (GRN) inference based on single-cell RNA sequencing (scRNA-seq) data plays a crucial role in understanding the regulatory mechanisms between genes. Various computational methods have been employed for GRN inference, but their network accuracy and model generalization are unsatisfactory, largely because of high-dimensional data and network sparsity. In this paper, we propose CVGAE, a self-supervised method for gene regulatory network inference using single-cell RNA sequencing data. CVGAE uses a graph neural network for inductive representation learning, which merges gene expression data and observed topology into a low-dimensional vector space. The trained vectors are used to calculate the distance between genes and, in turn, to predict interactions between genes. In the overall framework, FastICA is used to relieve the computational complexity caused by high-dimensional data, and CVGAE adopts multi-stacked GraphSAGE layers as an encoder and an improved decoder to overcome network sparsity. CVGAE is evaluated on several single-cell datasets containing four related ground-truth networks, and the results show that CVGAE achieves better performance than comparative methods. To validate its learning and generalization capabilities, CVGAE is applied in a few-shot setting by changing the ratio of the training set to the test set. In the few-shot condition, CVGAE obtains comparable or superior performance. CVGAE utilizes a beta variational autoencoder framework in conjunction with graph neural networks to characterize the underlying gene regulatory networks in single-cell gene expression data. The model employs multiple stacked SAGE layers to produce embedding representations of domain nodes, ensuring that the vector representation adheres to a multivariate Gaussian distribution. CVGAE further leverages convolutional computation and multi-layer perceptrons to determine the strength of interactions between nodes. [ABSTRACT FROM AUTHOR]
Published: 2024

3. TopoDoE: a design of experiment strategy for selection and refinement in ensembles of executable gene regulatory networks

Author: Matteo Bouvier, Souad Zreika, Elodie Vallin, Camille Fourneaux, Sandrine Gonin-Giraud, Arnaud Bonnaffoux, and Olivier Gandrillon
Subjects: Gene regulatory network inference, Executable GRN, GRN simulation, GRN ensemble, Design of experiment, Perturbation experiment, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Inference of Gene Regulatory Networks (GRNs) is a difficult and long-standing question in Systems Biology. Numerous approaches have been proposed with the latest methods exploring the richness of single-cell data. One of the current difficulties lies in the fact that many methods of GRN inference do not result in one proposed GRN but in a collection of plausible networks that need to be further refined. In this work, we present a Design of Experiment strategy to use as a second stage after the inference process. It is specifically fitted for identifying the next most informative experiment to perform for deciding between multiple network topologies, in the case where proposed GRNs are executable models. This strategy first performs a topological analysis to reduce the number of perturbations that need to be tested, then predicts the outcome of the retained perturbations by simulation of the GRNs and finally compares predictions with novel experimental data. Results: We apply this method to the results of our divide-and-conquer algorithm called WASABI, adapt its gene expression model to produce perturbations and compare our predictions with experimental results. We show that our networks were able to produce in silico predictions on the outcome of a gene knock-out, which were qualitatively validated for 48 out of 49 genes. Finally, we eliminate as many as two thirds of the candidate networks for which we could identify an incorrect topology, thus greatly improving the accuracy of our predictions. Conclusion: These results both confirm the inference accuracy of WASABI and show how executable gene expression models can be leveraged to further refine the topology of inferred GRNs. We hope this strategy will help systems biologists further explore their data and encourage the development of more executable GRN models.
Published: 2024

4. TopoDoE: a design of experiment strategy for selection and refinement in ensembles of executable gene regulatory networks.

Author: Bouvier, Matteo, Zreika, Souad, Vallin, Elodie, Fourneaux, Camille, Gonin-Giraud, Sandrine, Bonnaffoux, Arnaud, and Gandrillon, Olivier
Subjects: *SYSTEMS biology, *GENE expression, *GENE regulatory networks
Abstract: Background: Inference of Gene Regulatory Networks (GRNs) is a difficult and long-standing question in Systems Biology. Numerous approaches have been proposed with the latest methods exploring the richness of single-cell data. One of the current difficulties lies in the fact that many methods of GRN inference do not result in one proposed GRN but in a collection of plausible networks that need to be further refined. In this work, we present a Design of Experiment strategy to use as a second stage after the inference process. It is specifically fitted for identifying the next most informative experiment to perform for deciding between multiple network topologies, in the case where proposed GRNs are executable models. This strategy first performs a topological analysis to reduce the number of perturbations that need to be tested, then predicts the outcome of the retained perturbations by simulation of the GRNs and finally compares predictions with novel experimental data. Results: We apply this method to the results of our divide-and-conquer algorithm called WASABI, adapt its gene expression model to produce perturbations and compare our predictions with experimental results. We show that our networks were able to produce in silico predictions on the outcome of a gene knock-out, which were qualitatively validated for 48 out of 49 genes. Finally, we eliminate as many as two thirds of the candidate networks for which we could identify an incorrect topology, thus greatly improving the accuracy of our predictions. Conclusion: These results both confirm the inference accuracy of WASABI and show how executable gene expression models can be leveraged to further refine the topology of inferred GRNs. We hope this strategy will help systems biologists further explore their data and encourage the development of more executable GRN models. [ABSTRACT FROM AUTHOR]
Published: 2024

5. PMF-GRN: a variational inference approach to single-cell gene regulatory network inference using probabilistic matrix factorization

Author: Claudia Skok Gibbs, Omar Mahmood, Richard Bonneau, and Kyunghyun Cho
Subjects: Probabilistic matrix factorization, Variational inference, Gene regulatory network inference, Single cell, Gene expression, Biology (General), QH301-705.5, Genetics, QH426-470
Abstract: Inferring gene regulatory networks (GRNs) from single-cell data is challenging due to heuristic limitations. Existing methods also lack estimates of uncertainty. Here we present Probabilistic Matrix Factorization for Gene Regulatory Network Inference (PMF-GRN). Using single-cell expression data, PMF-GRN infers latent factors capturing transcription factor activity and regulatory relationships. Using variational inference allows hyperparameter search for principled model selection and direct comparison to other generative models. We extensively test and benchmark our method using real single-cell datasets and synthetic data. We show that PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods, offering well-calibrated uncertainty estimates.
Published: 2024

6. Studying temporal dynamics of single cells: expression, lineage and regulatory networks.

Author: Pan, Xinhai and Zhang, Xiuwei
Abstract: Learning how multicellular organs are developed from single cells to different cell types is a fundamental problem in biology. With the high-throughput scRNA-seq technology, computational methods have been developed to reveal the temporal dynamics of single cells from transcriptomic data, from phenomena on cell trajectories to the underlying mechanism that formed the trajectory. There are several distinct families of computational methods including Trajectory Inference (TI), Lineage Tracing (LT), and Gene Regulatory Network (GRN) Inference which are involved in such studies. This review summarizes these computational approaches which use scRNA-seq data to study cell differentiation and cell fate specification as well as the advantages and limitations of different methods. We further discuss how GRNs can potentially affect cell fate decisions and trajectory structures. [ABSTRACT FROM AUTHOR]
Published: 2024

7. Gene regulatory network inference based on causal discovery integrating with graph neural network.

Author: Feng, Ke, Jiang, Hongyang, Yin, Chaoyi, and Sun, Huiyan
Subjects: *GENE regulatory networks, *ARTIFICIAL neural networks, *GENE expression, *GENETIC regulation, *BIOLOGICAL networks
Abstract: Gene regulatory network (GRN) inference from gene expression data is a significant approach to understanding aspects of the biological system. Compared with generalized correlation‐based methods, causality‐inspired ones seem more rational to infer regulatory relationships. We propose GRINCD, a novel GRN inference framework empowered by graph representation learning and causal asymmetric learning, considering both linear and non‐linear regulatory relationships. First, high‐quality representation of each gene is generated using graph neural network. Then, we apply the additive noise model to predict the causal regulation of each regulator‐target pair. Additionally, we design two channels and finally assemble them for robust prediction. Through comprehensive comparisons of our framework with state‐of‐the‐art methods based on different principles on numerous datasets of diverse types and scales, the experimental results show that our framework achieves superior or comparable performance under various evaluation metrics. Our work provides a new clue for constructing GRNs, and our proposed framework GRINCD also shows potential in identifying key factors affecting cancer development. [ABSTRACT FROM AUTHOR]
Published: 2023

8. PMF-GRN: a variational inference approach to single-cell gene regulatory network inference using probabilistic matrix factorization

Author: Skok Gibbs, Claudia, Mahmood, Omar, Bonneau, Richard, and Cho, Kyunghyun
Published: 2024

9. Gene regulation network inference using k-nearest neighbor-based mutual information estimation: revisiting an old DREAM

Author: Lior I. Shachaf, Elijah Roberts, Patrick Cahan, and Jie Xiao
Subjects: Gene regulatory network inference, Mutual information, k-nearest neighbor, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: A cell exhibits a variety of responses to internal and external cues. These responses are possible, in part, due to the presence of an elaborate gene regulatory network (GRN) in every single cell. In the past 20 years, many groups worked on reconstructing the topological structure of GRNs from large-scale gene expression data using a variety of inference algorithms. Insights gained about participating players in GRNs may ultimately lead to therapeutic benefits. Mutual information (MI) is a widely used metric within this inference/reconstruction pipeline as it can detect any correlation (linear and non-linear) between any number of variables (n-dimensions). However, the use of MI with continuous data (for example, normalized fluorescence intensity measurement of gene expression levels) is sensitive to data size, correlation strength and underlying distributions, and often requires laborious and, at times, ad hoc optimization. Results: In this work, we first show that estimating MI of a bi- and tri-variate Gaussian distribution using k-nearest neighbor (kNN) MI estimation results in significant error reduction as compared to commonly used methods based on fixed binning. Second, we demonstrate that implementing the MI-based kNN Kraskov–Stögbauer–Grassberger (KSG) algorithm leads to a significant improvement in GRN reconstruction for popular inference algorithms, such as Context Likelihood of Relatedness (CLR). Finally, through extensive in-silico benchmarking we show that a new inference algorithm CMIA (Conditional Mutual Information Augmentation), inspired by CLR, in combination with the KSG-MI estimator, outperforms commonly used methods. Conclusions: Using three canonical datasets containing 15 synthetic networks, the newly developed method for GRN reconstruction, which combines CMIA and the KSG-MI estimator, achieves an improvement of 20–35% in precision-recall measures over the current gold standard in the field. This new method will enable researchers to discover new gene interactions or better choose gene candidates for experimental validations.
Published: 2023
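
Note: the abstract above refers to Context Likelihood of Relatedness (CLR) as a popular MI-based inference algorithm. As a rough illustration only, the sketch below applies the commonly described CLR scoring step to a precomputed mutual-information matrix. It is not the authors' code and does not implement KSG estimation or CMIA. The function name `clr_scores`, the toy matrix `mi`, and the choices to exclude self-MI from each gene's background and to use the population standard deviation are assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def clr_scores(mi):
    """CLR-style scores from a symmetric mutual-information matrix (list of lists).

    Assumption: each gene's background distribution is its MI with every
    other gene (self-MI excluded), summarized by mean and population std.
    """
    n = len(mi)
    background = []
    for i in range(n):
        others = [mi[i][j] for j in range(n) if j != i]
        background.append((mean(others), pstdev(others)))

    def z(i, j):
        # Z-score of MI(i, j) against gene i's background, clipped at zero.
        mu, sigma = background[i]
        if sigma == 0:
            return 0.0
        return max(0.0, (mi[i][j] - mu) / sigma)

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Combine the two directional z-scores into one symmetric score.
            s = math.sqrt(z(i, j) ** 2 + z(j, i) ** 2)
            scores[i][j] = scores[j][i] = s
    return scores


# Toy MI matrix for four hypothetical genes (values are made up).
mi = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.2],
    [0.1, 0.1, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]
scores = clr_scores(mi)
```

In this toy input the pair (0, 1) stands out against both genes' backgrounds and so receives the highest score, while pairs whose MI sits at or below the background level score zero. In practice the MI matrix would come from an estimator such as the kNN-based one the abstract discusses.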

10. Ensemble Learning Based Gene Regulatory Network Inference.

Author: Peignier, Sergio, Sorin, Baptiste, and Calevro, Federica
Subjects: *GENE regulatory networks, *COMPUTATIONAL biology, *SPACE exploration, *GENETIC algorithms, *MACHINE learning
Abstract: In the machine learning field, the technique known as ensemble learning aims at combining different base learners in order to increase the quality and the robustness of the predictions. Indeed, this approach has widely been applied to tackle, with success, real world problems from different domains, including computational biology. Nevertheless, despite their potential, ensembles combining results from different base learners have been understudied in the context of gene regulatory network inference. In this paper we applied genetic algorithms and frequent itemset mining, to design small but effective ensembles of gene regulatory network inference methods. These ensembles were evaluated and compared to well-established single and ensemble methods, on both real and synthetic datasets. Results showed that small ensembles, consisting of few but diverse base learners, enhance the exploration of the solution space, and compensate base learners biases, outperforming state-of-the-art methods. Results advocate for the use of such methods as gene regulatory network inference tools. [ABSTRACT FROM AUTHOR]
Published: 2023

11. A novel Boolean network inference strategy to model early hematopoiesis aging

Author: Léonard Hérault, Mathilde Poplineau, Estelle Duprez, and Élisabeth Remy
Subjects: Aging, Hematopoietic stem cells, Single-cell RNA seq, Gene regulatory network Inference, Boolean modelling, Biotechnology, TP248.13-248.65
Abstract: Hematopoietic stem cell (HSC) aging is a multifactorial event leading to changes in HSC properties and functions, which are intrinsically coordinated and affect the early hematopoiesis. To better understand the mechanisms and factors controlling these changes, we developed an original strategy to construct a Boolean model of HSC differentiation. Based on our previous scRNA-seq data, we exhaustively characterized active transcription modules or regulons along the differentiation trajectory and constructed an influence graph between 15 selected components involved in the dynamics of the process. Then we defined dynamical constraints between observed cellular states along the trajectory and using answer set programming with in silico perturbation analysis, we obtained a Boolean model explaining the early priming of HSCs. Finally, perturbations of the model based on age-related changes revealed important deregulations, such as the overactivation of Egr1 and Junb or the loss of Cebpa activation by Gata2. These new regulatory mechanisms were found to be relevant for the myeloid bias of aged HSC and explain the decreased transcriptional priming of HSCs to all mature cell types except megakaryocytes.
Published: 2023

12. Gene regulation network inference using k-nearest neighbor-based mutual information estimation: revisiting an old DREAM.

Author: Shachaf, Lior I., Roberts, Elijah, Cahan, Patrick, and Xiao, Jie
Subjects: *GENETIC regulation, *GENE regulatory networks, *K-nearest neighbor classification, *GENE expression, *GAUSSIAN distribution
Abstract: Background: A cell exhibits a variety of responses to internal and external cues. These responses are possible, in part, due to the presence of an elaborate gene regulatory network (GRN) in every single cell. In the past 20 years, many groups worked on reconstructing the topological structure of GRNs from large-scale gene expression data using a variety of inference algorithms. Insights gained about participating players in GRNs may ultimately lead to therapeutic benefits. Mutual information (MI) is a widely used metric within this inference/reconstruction pipeline as it can detect any correlation (linear and non-linear) between any number of variables (n-dimensions). However, the use of MI with continuous data (for example, normalized fluorescence intensity measurement of gene expression levels) is sensitive to data size, correlation strength and underlying distributions, and often requires laborious and, at times, ad hoc optimization. Results: In this work, we first show that estimating MI of a bi- and tri-variate Gaussian distribution using k-nearest neighbor (kNN) MI estimation results in significant error reduction as compared to commonly used methods based on fixed binning. Second, we demonstrate that implementing the MI-based kNN Kraskov–Stögbauer–Grassberger (KSG) algorithm leads to a significant improvement in GRN reconstruction for popular inference algorithms, such as Context Likelihood of Relatedness (CLR). Finally, through extensive in-silico benchmarking we show that a new inference algorithm CMIA (Conditional Mutual Information Augmentation), inspired by CLR, in combination with the KSG-MI estimator, outperforms commonly used methods. Conclusions: Using three canonical datasets containing 15 synthetic networks, the newly developed method for GRN reconstruction, which combines CMIA and the KSG-MI estimator, achieves an improvement of 20–35% in precision-recall measures over the current gold standard in the field. This new method will enable researchers to discover new gene interactions or better choose gene candidates for experimental validations. [ABSTRACT FROM AUTHOR]
Published: 2023

13. GReNaDIne: A Data-Driven Python Library to Infer Gene Regulatory Networks from Gene Expression Data.

Author: Schmitt, Pauline, Sorin, Baptiste, Frouté, Timothée, Parisot, Nicolas, Calevro, Federica, and Peignier, Sergio
Subjects: *GENE regulatory networks, *PYTHON programming language, *GENE expression, *GENE libraries, *SYSTEMS biology, *PROGRAMMING languages
Abstract: Context: Inferring gene regulatory networks (GRN) from high-throughput gene expression data is a challenging task for which different strategies have been developed. Nevertheless, no ever-winning method exists, and each method has its advantages, intrinsic biases, and application domains. Thus, in order to analyze a dataset, users should be able to test different techniques and choose the most appropriate one. This step can be particularly difficult and time consuming, since most methods' implementations are made available independently, possibly in different programming languages. The implementation of an open-source library containing different inference methods within a common framework is expected to be a valuable toolkit for the systems biology community. Results: In this work, we introduce GReNaDIne (Gene Regulatory Network Data-driven Inference), a Python package that implements 18 machine learning data-driven gene regulatory network inference methods. It also includes eight generalist preprocessing techniques, suitable for both RNA-seq and microarray dataset analysis, as well as four normalization techniques dedicated to RNA-seq. In addition, this package implements the possibility to combine the results of different inference tools to form robust and efficient ensembles. This package has been successfully assessed under the DREAM5 challenge benchmark dataset. The open-source GReNaDIne Python package is made freely available in a dedicated GitLab repository, as well as in the official third-party software repository PyPI Python Package Index. The latest documentation on the GReNaDIne library is also available at Read the Docs, an open-source software documentation hosting platform. Contribution: The GReNaDIne tool represents a technological contribution to the field of systems biology. This package can be used to infer gene regulatory networks from high-throughput gene expression data using different algorithms within the same framework. In order to analyze their datasets, users can apply a battery of preprocessing and postprocessing tools and choose the most adapted inference method from the GReNaDIne library and even combine the output of different methods to obtain more robust results. The results format provided by GReNaDIne is compatible with well-known complementary refinement tools such as PYSCENIC. [ABSTRACT FROM AUTHOR]
Published: 2023

14. A generic parallel framework for inferring large-scale gene regulatory networks from expression profiles: application to Alzheimer's disease network.

Author: Sebastian, Softya, Roy, Swarup, and Kalita, Jugal
Subjects: *GENE regulatory networks, *GENE expression, *ALZHEIMER'S disease, *PARALLEL programming, *NETWORK hubs, *TRANSGENIC mice
Abstract: The inference of large-scale gene regulatory networks is essential for understanding comprehensive interactions among genes. Most existing methods are limited to reconstructing networks with a few hundred nodes. Therefore, parallel computing paradigms must be leveraged to construct large networks. We propose a generic parallel framework that enables any existing method, without re-engineering, to infer large networks in parallel, guaranteeing quality output. The framework is tested on 15 inference methods (though it is not limited to these) using in silico benchmarks and real-world large expression matrices, followed by qualitative and speedup assessment. The framework does not compromise the quality of the base serial inference method. We rank the candidate methods and use the top-performing method to infer an Alzheimer's Disease (AD) affected network from large expression profiles of a triple transgenic mouse model consisting of 45,101 genes. The resultant network is further explored to obtain hub genes that emerge as functionally related to the disease. We partition the network into 41 modules and conduct pathway enrichment analysis, revealing that a good number of participating genes are collectively responsible for several brain disorders, including AD. Finally, we extract the interactions of a few known AD genes and observe that they are periphery genes connected to the network's hub genes. Availability: The R implementation of the framework is downloadable from https://github.com/Netralab/GenericParallelFramework. [ABSTRACT FROM AUTHOR]
Published: 2023

15. GRNMOPT: Inference of gene regulatory networks based on a multi-objective optimization approach.

Author: Dong H, Ma B, Meng Y, Wu Y, Liu Y, Zeng T, and Huang J
Abstract: Background and Objective: The reconstruction of gene regulatory networks (GRNs) stands as a vital approach in deciphering complex biological processes. The application of nonlinear ordinary differential equations (ODEs) models has demonstrated considerable efficacy in predicting GRNs. Notably, the decay rate and time delay are pivotal in authentic gene regulation, yet their systematic determination in ODEs models remains underexplored. The development of a comprehensive optimization framework for the effective estimation of these key parameters is essential for accurate GRN inference. Method: This study introduces GRNMOPT, an innovative methodology for inferring GRNs from time-series and steady-state data. GRNMOPT employs a combined use of decay rate and time delay in constructing ODEs models to authentically represent gene regulatory processes. It incorporates a multi-objective optimization approach, optimizing decay rate and time delay concurrently to derive Pareto optimal sets for these factors, thereby maximizing accuracy metrics such as AUROC (Area Under the Receiver Operating Characteristic curve) and AUPR (Area Under the Precision-Recall curve). Additionally, the use of XGBoost for calculating feature importance aids in identifying potential regulatory gene links. Results: Comprehensive experimental evaluations on two simulated datasets from DREAM4 and three real gene expression datasets (Yeast, In vivo Reverse-engineering and Modeling Assessment [IRMA], and Escherichia coli [E. coli]) reveal that GRNMOPT performs commendably across varying network scales. Furthermore, cross-validation experiments substantiate the robustness of GRNMOPT. Conclusion: We propose a novel approach called GRNMOPT to infer GRNs based on a multi-objective optimization framework, which effectively improves inference accuracy and provides a powerful tool for GRNs inference. Competing Interests: All authors disclosed no relevant relationships. (Copyright © 2024 Elsevier Ltd. All rights reserved.)
Published: 2024

16. Inferring causal gene regulatory network via GreyNet: From dynamic grey association to causation

Author: Guangyi Chen and Zhi-Ping Liu
Subjects: gene regulatory network inference, dynamic grey association, adaptive sliding window, causation, machine learning, Biotechnology, TP248.13-248.65
Abstract: Gene regulatory network (GRN) provides abundant information on gene interactions, which contributes to demonstrating pathology, predicting clinical outcomes, and identifying drug targets. Existing high-throughput experiments provide rich time-series gene expression data to reconstruct the GRN to further gain insights into the mechanism of organisms responding to external stimuli. Numerous machine-learning methods have been proposed to infer gene regulatory networks. Nevertheless, machine learning, especially deep learning, is generally a “black box,” which lacks interpretability. The causality has not been well recognized in GRN inference procedures. In this article, we introduce grey theory integrated with the adaptive sliding window technique to flexibly capture instant gene–gene interactions in the uncertain regulatory system. Then, we incorporate generalized multivariate Granger causality regression methods to transform the dynamic grey association into causation to generate directional regulatory links. We evaluate our model on the DREAM4 in silico benchmark dataset and real-world hepatocellular carcinoma (HCC) time-series data. We achieved competitive results on the DREAM4 compared with other state-of-the-art algorithms and gained meaningful GRN structure on HCC data respectively.
Published: 2022

17. Optimal Sparsity Selection Based on an Information Criterion for Accurate Gene Regulatory Network Inference.

Author: Seçilmiş, Deniz, Nelander, Sven, and Sonnhammer, Erik L. L.
Abstract: Accurate inference of gene regulatory networks (GRNs) is important to unravel unknown regulatory mechanisms and processes, which can lead to the identification of treatment targets for genetic diseases. A variety of GRN inference methods have been proposed that, under suitable data conditions, perform well in benchmarks that consider the entire spectrum of false-positives and -negatives. However, it is very challenging to predict which single network sparsity gives the most accurate GRN. Lacking criteria for sparsity selection, a simplistic solution is to pick the GRN that has a certain number of links per gene, which is guessed to be reasonable. However, this does not guarantee finding the GRN that has the correct sparsity or is the most accurate one. In this study, we provide a general approach for identifying the most accurate and sparsity-wise relevant GRN within the entire space of possible GRNs. The algorithm, called SPA, applies a "GRN information criterion" (GRNIC) that is inspired by two commonly used model selection criteria, Akaike and Bayesian Information Criterion (AIC and BIC) but adapted to GRN inference. The results show that the approach can, in most cases, find the GRN whose sparsity is close to the true sparsity and close to as accurate as possible with the given GRN inference method and data. The datasets and source code can be found at https://bitbucket.org/sonnhammergrni/spa/. [ABSTRACT FROM AUTHOR]
Published: 2022
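
Note: the abstract above says the GRNIC is inspired by the Akaike and Bayesian Information Criteria but does not give its formula. For orientation only, the standard textbook definitions of AIC and BIC are reproduced below; they are not the GRNIC itself. Here k is the number of free parameters, n the number of observations, and \hat{L} the maximized likelihood. Reading k as the number of links in a candidate GRN is an assumption made for this note, not a statement from the abstract.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Both criteria reward goodness of fit through the likelihood term and penalize model size through k, so the model with the lowest value is preferred. BIC penalizes each extra parameter more heavily than AIC whenever \ln n > 2, that is, for n of 8 or more.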

18. Optimal Sparsity Selection Based on an Information Criterion for Accurate Gene Regulatory Network Inference

Author: Deniz Seçilmiş, Sven Nelander, and Erik L. L. Sonnhammer
Subjects: sparsity selection, information criteria, gene regulatory network inference, gene expression data, noise in gene expression, Genetics, QH426-470
Abstract: Accurate inference of gene regulatory networks (GRNs) is important to unravel unknown regulatory mechanisms and processes, which can lead to the identification of treatment targets for genetic diseases. A variety of GRN inference methods have been proposed that, under suitable data conditions, perform well in benchmarks that consider the entire spectrum of false-positives and -negatives. However, it is very challenging to predict which single network sparsity gives the most accurate GRN. Lacking criteria for sparsity selection, a simplistic solution is to pick the GRN that has a certain number of links per gene, which is guessed to be reasonable. However, this does not guarantee finding the GRN that has the correct sparsity or is the most accurate one. In this study, we provide a general approach for identifying the most accurate and sparsity-wise relevant GRN within the entire space of possible GRNs. The algorithm, called SPA, applies a “GRN information criterion” (GRNIC) that is inspired by two commonly used model selection criteria, Akaike and Bayesian Information Criterion (AIC and BIC) but adapted to GRN inference. The results show that the approach can, in most cases, find the GRN whose sparsity is close to the true sparsity and close to as accurate as possible with the given GRN inference method and data. The datasets and source code can be found at https://bitbucket.org/sonnhammergrni/spa/.
Published: 2022
Full Text: View/download PDF
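
Note on the record above: the abstract says GRNIC is inspired by AIC and BIC but does not define it. For orientation only, the two standard criteria are usually written as below, where k is the number of fitted parameters, n the number of observations and \hat{L} the maximized likelihood. How SPA adapts these to GRN sparsity is not stated in this record, so it is not reproduced here.

```latex
\begin{align}
\mathrm{AIC} &= 2k - 2\ln\hat{L} \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}
\end{align}
```

In both criteria a lower value is preferred. BIC penalizes additional parameters more heavily than AIC once n exceeds about 7 (that is, when ln n > 2).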

19. Inferring and analyzing gene regulatory networks from multi-factorial expression data: a complete and interactive suite

Author: Océane Cassan, Sophie Lèbre, and Antoine Martin
Subjects: Gene regulatory network inference, Graphical user interface, Multifactorial transcriptomic analysis, Model-based clustering, Analysis workflow, Biotechnology, TP248.13-248.65, Genetics, QH426-470
Abstract: Background: High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed and allow a broad range of users to conduct standard analyses from RNA-seq data, even with little programming experience. Although existing solutions usually provide adequate procedures for normalization, exploration or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies. Results: We developed here a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data) designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. The DIANE interactive workflow provides normalization, dimensionality reduction, differential expression and ontology enrichment. Gene clustering can be performed and explored via configurable Mixture Models, and Random Forests are used to infer gene regulatory networks. DIANE also includes a novel procedure to assess the statistical significance of regulator-target influence measures based on permutations for Random Forest importance metrics. All along the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses. Conclusions: We demonstrate the value and the benefits of DIANE using a recently published data set describing the transcriptional response of Arabidopsis thaliana under the combination of temperature, drought and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures with RNA-Seq data, perform model-based clustering of gene expression profiles and go further into gene network reconstruction, providing relevant candidate genes or signalling pathways to explore. DIANE is available as a web service (https://diane.bpmp.inrae.fr), or can be installed and locally launched as a complete R package.
Published: 2021
Full Text: View/download PDF

20. PFBNet: a priori-fused boosting method for gene regulatory network inference

Author: Dandan Che, Shun Guo, Qingshan Jiang, and Lifei Chen
Subjects: Gene regulatory network inference, Time-series expression data, Boosting, Prior information fusion, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in systems biology. In the past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge because the data are noisy and high-dimensional, and there is a large number of potential interactions. Results: We present a novel method, the priori-fused boosting network inference method (PFBNet), to infer GRNs from time-series expression data by using the non-linear model of boosting and a fusion scheme for prior information (e.g., knockout data). Specifically, PFBNet first calculates the confidences of the regulation relationships using the boosting-based model, where the information about the accumulation impact of the gene expressions at previous time points is taken into account. Then, a newly defined strategy is applied to fuse the information from the prior data by elevating the confidences of the regulation relationships from the corresponding regulators. Conclusions: The experiments on the benchmark datasets from the DREAM challenge as well as the E. coli datasets show that PFBNet achieves significantly better performance than other state-of-the-art methods (Jump3, GENIE3-lag, HiDi, iRafNet and BiXGBoost).
Published: 2020
Full Text: View/download PDF

21. Integration of single-cell multi-omics for gene regulatory network inference

Author: Xinlin Hu, Yaohua Hu, Fanjie Wu, Ricky Wai Tak Leung, and Jing Qin
Subjects: Single-cell sequencing, Gene regulatory network inference, Single-cell multi-omics integration, Biotechnology, TP248.13-248.65
Abstract: The advancement of single-cell sequencing technology in recent years has provided an opportunity to reconstruct gene regulatory networks (GRNs) with the data from thousands of single cells in one sample. This uncovers regulatory interactions in cells and speeds up the discovery of regulatory mechanisms in diseases and biological processes. Therefore, more methods have been proposed to reconstruct GRNs using single-cell sequencing data. In this review, we introduce technologies for sequencing the single-cell genome, transcriptome, and epigenome. We also present an overview of current GRN reconstruction strategies utilizing different single-cell sequencing data. Bioinformatics tools are grouped by their input data type and mathematical principles for the reader's convenience, and the fundamental mathematics inherent in each group is discussed. Furthermore, the adaptabilities and limitations of these different methods are summarized and compared, in the hope of helping researchers identify the most suitable tools for them.
Published: 2020
Full Text: View/download PDF

22. Data-driven Gene Regulatory Networks Inference Based on Classification Algorithms.

Author: Peignier, Sergio, Schmitt, Pauline, and Calevro, Federica
Subjects: *CLASSIFICATION algorithms, *SYSTEMS biology, *GENETIC regulation, *GENE expression, *GENE regulatory networks
Abstract: Inferring gene regulatory networks from high-throughput gene expression data is a challenging problem addressed by the systems biology community. Most approaches that aim to unravel gene regulation mechanisms in a data-driven way analyze gene expression datasets to score potential regulatory links between transcription factors and target genes. So far, three major families of approaches have been proposed to score regulatory links. These methods rely respectively on correlation measures, mutual information metrics, and regression algorithms. In this paper we present a new family of data-driven inference methods. This new family, inspired by the regression-based paradigm, relies on the use of classification algorithms. This paper assesses and advocates for the use of this paradigm as a promising new approach to infer gene regulatory networks. Indeed, the development and assessment of five new inference methods based on well-known classification algorithms shows that the classification-based inference family exhibits good results when compared to well-established paradigms. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

23. CONFIGURE: A pipeline for identifying context specific regulatory modules from gene expression data and its application to breast cancer

Author: Sungjoon Park, Doyeong Hwang, Yoon Sun Yeo, Hyunggee Kim, and Jaewoo Kang
Subjects: Context specific regulatory module, Gene regulatory network inference, Single sample GSEA, Feature importance score, Breast cancer subtype, Internal medicine, RC31-1245, Genetics, QH426-470
Abstract: Background: Gene expression data is widely used for identifying subtypes of diseases such as cancer. Differentially expressed gene analysis and gene set enrichment analysis are widely used for identifying biological mechanisms at the gene level and gene set level, respectively. However, the results of differentially expressed gene analysis are difficult to interpret and gene set enrichment analysis does not consider the interactions among genes in a gene set. Results: We present CONFIGURE, a pipeline that identifies context specific regulatory modules from gene expression data. First, CONFIGURE takes gene expression data and context label information as inputs and constructs regulatory modules. Then, CONFIGURE makes a regulatory module enrichment score (RMES) matrix of enrichment scores of the regulatory modules on samples using the single-sample GSEA method. CONFIGURE calculates the importance scores of the regulatory modules on each context to rank the regulatory modules. We evaluated CONFIGURE on the Cancer Genome Atlas (TCGA) breast cancer RNA-seq dataset to determine whether it can produce biologically meaningful regulatory modules for breast cancer subtypes. We first evaluated whether RMESs are useful for differentiating breast cancer subtypes using a multi-class classifier and one-vs-rest binary SVM classifiers. The multi-class and one-vs-rest binary classifiers were trained using the RMESs as features and outperformed baseline classifiers. Furthermore, we conducted literature surveys on the basal-like type specific regulatory modules obtained by CONFIGURE and showed that highly ranked modules were associated with the phenotypes of basal-like type breast cancers. Conclusions: We showed that enrichment scores of regulatory modules are useful for differentiating breast cancer subtypes and validated the basal-like type specific regulatory modules by literature surveys. In doing so, we found regulatory module candidates that have not been reported in previous literature. This demonstrates that CONFIGURE can be used to predict novel regulatory markers which can be validated by downstream wet lab experiments. We validated CONFIGURE on the breast cancer RNA-seq dataset in this work but CONFIGURE can be applied to any gene expression dataset containing context information.
Published: 2019
Full Text: View/download PDF

24. Inferring and analyzing gene regulatory networks from multi-factorial expression data: a complete and interactive suite.

Author: Cassan, Océane, Lèbre, Sophie, and Martin, Antoine
Subjects: *GENE regulatory networks, *GENE expression profiling, *RANDOM forest algorithms, *WEB services, *USER interfaces, *GRAPHICAL user interfaces
Abstract: Background: High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed and allow a broad range of users to conduct standard analyses from RNA-seq data, even with little programming experience. Although existing solutions usually provide adequate procedures for normalization, exploration or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies. Results: We developed here a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data) designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. The DIANE interactive workflow provides normalization, dimensionality reduction, differential expression and ontology enrichment. Gene clustering can be performed and explored via configurable Mixture Models, and Random Forests are used to infer gene regulatory networks. DIANE also includes a novel procedure to assess the statistical significance of regulator-target influence measures based on permutations for Random Forest importance metrics. All along the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses. Conclusions: We demonstrate the value and the benefits of DIANE using a recently published data set describing the transcriptional response of Arabidopsis thaliana under the combination of temperature, drought and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures with RNA-Seq data, perform model-based clustering of gene expression profiles and go further into gene network reconstruction, providing relevant candidate genes or signalling pathways to explore. DIANE is available as a web service (https://diane.bpmp.inrae.fr), or can be installed and locally launched as a complete R package. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

25. Improving network inference algorithms using resampling methods

Author: Sean M Colby, Ryan S McClure, Christopher C Overall, Ryan S Renslow, and Jason E McDermott
Subjects: Gene regulatory network inference, Random subspace method, Resampling, Bootstrapping, Aggregation, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Relatively small changes to gene expression data dramatically affect co-expression networks inferred from that data which, in turn, can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes the generalization from statistical theory holds true in biological network inference applications. Results: We evaluated the effect of bootstrap aggregation on inferred networks using commonly applied network inference methods in terms of stability, or resilience to perturbations in the underlying expression data, a metric for accuracy, and functional enrichment of edge interactions. Conclusion: Bootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement to accuracy assessed by each method's ability to link genes in the same functional pathway.
Published: 2018
Full Text: View/download PDF
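
Note on the record above: the abstract describes bootstrap aggregation of inferred networks without giving details. The sketch below is only an illustration of the general idea, not the authors' implementation. It assumes a simple thresholded Pearson-correlation network as the base inference method, and the names `correlation_network` and `bagged_network`, the threshold of 0.8 and the number of resamples are all hypothetical choices.

```python
import numpy as np

def correlation_network(expr, threshold):
    """Binary co-expression adjacency: edge where |Pearson r| >= threshold.

    expr has shape (n_genes, n_samples); this base method is an assumption
    chosen for illustration only.
    """
    corr = np.corrcoef(expr)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # ignore self-edges
    return adj

def bagged_network(expr, threshold=0.8, n_boot=100, seed=0):
    """Bootstrap aggregation: resample samples with replacement, infer a
    network on each resample, and return the per-edge selection frequency."""
    rng = np.random.default_rng(seed)
    n_genes, n_samples = expr.shape
    freq = np.zeros((n_genes, n_genes))
    for _ in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # bootstrap draw
        freq += correlation_network(expr[:, idx], threshold)
    return freq / n_boot
```

Edges with a frequency near 1 persist across resamples, while edges driven by a few outlying samples tend to receive low frequencies; a cut-off on the frequency then gives the aggregated network.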

26. PFBNet: a priori-fused boosting method for gene regulatory network inference.

Author: Che, Dandan, Guo, Shun, Jiang, Qingshan, and Chen, Lifei
Subjects: *GENE regulatory networks, *GENE expression
Abstract: Background: Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in systems biology. In the past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge because the data are noisy and high-dimensional, and there is a large number of potential interactions. Results: We present a novel method, the priori-fused boosting network inference method (PFBNet), to infer GRNs from time-series expression data by using the non-linear model of boosting and a fusion scheme for prior information (e.g., knockout data). Specifically, PFBNet first calculates the confidences of the regulation relationships using the boosting-based model, where the information about the accumulation impact of the gene expressions at previous time points is taken into account. Then, a newly defined strategy is applied to fuse the information from the prior data by elevating the confidences of the regulation relationships from the corresponding regulators. Conclusions: The experiments on the benchmark datasets from the DREAM challenge as well as the E. coli datasets show that PFBNet achieves significantly better performance than other state-of-the-art methods (Jump3, GENIE3-lag, HiDi, iRafNet and BiXGBoost). [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

27. tuxnet: a simple interface to process RNA sequencing data and infer gene regulatory networks.

Author: Spurney, Ryan J., Van den Broeck, Lisa, Clark, Natalie M., Fisher, Adam P., de Luis Balaguer, Maria A., and Sozzani, Rosangela
Subjects: *RNA sequencing, *GRAPHICAL user interfaces, *GENE regulatory networks, *GENETIC regulation, *GENETIC techniques, *GENE expression
Abstract: Summary: Predicting gene regulatory networks (GRNs) from expression profiles is a common approach for identifying important biological regulators. Despite the increased use of inference methods, existing computational approaches often do not integrate RNA-sequencing data analysis, are not automated or are restricted to users with bioinformatics backgrounds. To address these limitations, we developed tuxnet, a user-friendly platform that can process raw RNA-sequencing data from any organism with an existing reference genome using a modified tuxedo pipeline (hisat 2 + cufflinks package) and infer GRNs from these processed data. tuxnet is implemented as a graphical user interface and can mine gene regulations, either by applying a dynamic Bayesian network (DBN) inference algorithm, genist, or a regression tree-based pipeline, rtp-star. We obtained time-course expression data of a PERIANTHIA (PAN) inducible line and inferred a GRN using genist to illustrate the use of tuxnet while gaining insight into the regulations downstream of the Arabidopsis root stem cell regulator PAN. Using rtp-star, we inferred the network of ATHB13, a downstream gene of PAN, for which we obtained wild-type and mutant expression profiles. Additionally, we generated two networks using temporal data from developmental leaf data and spatial data from root cell-type data to highlight the use of tuxnet to form new testable hypotheses from previously explored data. Our case studies feature the versatility of tuxnet when using different types of gene expression data to infer networks and its accessibility as a pipeline for non-bioinformaticians to analyze transcriptome data, predict causal regulations, assess network topology and identify key regulators. Significance Statement: tuxnet offers a simple integrated interface for both computational and non-computational biologists to perform RNA-seq data analysis and infer GRNs from RNA-seq data (https://rspurney.github.io/TuxNet/). By implementing network inference techniques, tuxnet allows for the prediction of causal regulations with high confidence and thus is a practical tool to evaluate and handle transcriptome data. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

28. PropaNet: Time-Varying Condition-Specific Transcriptional Network Construction by Network Propagation

Author: Hongryul Ahn, Kyuri Jo, Dabin Jeong, Minwoo Pak, Jihye Hur, Woosuk Jung, and Sun Kim
Subjects: transcription factor, gene regulatory network inference, time-varying, plant stress, influence maximization, network propagation, Plant culture, SB1-1110
Abstract: Transcription factors (TFs) have a significant influence on the state of a cell by regulating multiple downstream genes. Thus, experimental and computational biologists have made great efforts to construct TF gene networks for regulatory interactions between TFs and their target genes. Now, an important research question is how to utilize TF networks to investigate the response of a plant to stress at the transcription control level using time-series transcriptome data. In this article, we present a new computational network, PropaNet, to investigate the dynamics of TF networks from time-series transcriptome data using two state-of-the-art network analysis techniques, influence maximization and network propagation. PropaNet uses the influence maximization technique to produce a ranked list of TFs, ordered by how well each TF explains the differentially expressed genes (DEGs) at each time point. Then, a network propagation technique is used to select a group of TFs that explains the DEGs best as a whole. For the analysis of Arabidopsis time-series datasets from AtGenExpress, we used PlantRegMap as a template TF network and performed PropaNet analysis to investigate the transcriptional dynamics of Arabidopsis under cold and heat stress. The time-varying TF networks showed that Arabidopsis responded to cold and heat stress quite differently. For cold stress, bHLH and bZIP type TFs were the first responding TFs and the cold signal influenced histone variants and various genes involved in cell architecture, osmosis and restructuring of cells. In contrast, the consequences of heat stress were up-regulation of genes related to accelerating differentiation and starting re-differentiation. In terms of energy metabolism, plants under heat stress showed elevated metabolic processes, resulting in an exhausted state. We believe that PropaNet will be useful for the construction of condition-specific time-varying TF networks for time-series data analysis in response to stress. PropaNet is available at http://biohealth.snu.ac.kr/software/PropaNet.
Published: 2019
Full Text: View/download PDF

29. Network-based analysis of heterogeneous patient-matched brain and extracranial melanoma metastasis pairs reveals three homogeneous subgroups.

Author: Grützmann K, Kraft T, Meinhardt M, Meier F, Westphal D, and Seifert M
Abstract: Melanoma, the deadliest form of skin cancer, can metastasize to different organs. Molecular differences between brain and extracranial melanoma metastases are poorly understood. Here, promoter methylation and gene expression of 11 heterogeneous patient-matched pairs of brain and extracranial metastases were analyzed using melanoma-specific gene regulatory networks learned from public transcriptome and methylome data followed by network-based impact propagation of patient-specific alterations. This innovative data analysis strategy made it possible to predict potential impacts of patient-specific driver candidate genes on other genes and pathways. The patient-matched metastasis pairs clustered into three robust subgroups with specific downstream targets with known roles in cancer, including melanoma (SG1: RBM38, BCL11B, SG2: GATA3, FES, SG3: SLAMF6, PYCARD). Patient subgroups and ranking of target gene candidates were confirmed in a validation cohort. Summarizing, computational network-based impact analyses of heterogeneous metastasis pairs predicted individual regulatory differences in melanoma brain metastases, cumulating into three consistent subgroups with specific downstream target genes.
Published: 2024
Full Text: View/download PDF

30. PropaNet: Time-Varying Condition-Specific Transcriptional Network Construction by Network Propagation.

Author: Ahn, Hongryul, Jo, Kyuri, Jeong, Dabin, Pak, Minwoo, Hur, Jihye, Jung, Woosuk, and Kim, Sun
Subjects: PHYSIOLOGICAL effects of heat, GENE regulatory networks, TIME series analysis, TIME-varying networks, ENERGY metabolism, BIOLOGISTS
Abstract: Transcription factors (TFs) have a significant influence on the state of a cell by regulating multiple downstream genes. Thus, experimental and computational biologists have made great efforts to construct TF gene networks for regulatory interactions between TFs and their target genes. Now, an important research question is how to utilize TF networks to investigate the response of a plant to stress at the transcription control level using time-series transcriptome data. In this article, we present a new computational network, PropaNet, to investigate the dynamics of TF networks from time-series transcriptome data using two state-of-the-art network analysis techniques, influence maximization and network propagation. PropaNet uses the influence maximization technique to produce a ranked list of TFs, ordered by how well each TF explains the differentially expressed genes (DEGs) at each time point. Then, a network propagation technique is used to select a group of TFs that explains the DEGs best as a whole. For the analysis of Arabidopsis time-series datasets from AtGenExpress, we used PlantRegMap as a template TF network and performed PropaNet analysis to investigate the transcriptional dynamics of Arabidopsis under cold and heat stress. The time-varying TF networks showed that Arabidopsis responded to cold and heat stress quite differently. For cold stress, bHLH and bZIP type TFs were the first responding TFs and the cold signal influenced histone variants and various genes involved in cell architecture, osmosis and restructuring of cells. In contrast, the consequences of heat stress were up-regulation of genes related to accelerating differentiation and starting re-differentiation. In terms of energy metabolism, plants under heat stress showed elevated metabolic processes, resulting in an exhausted state. We believe that PropaNet will be useful for the construction of condition-specific time-varying TF networks for time-series data analysis in response to stress. PropaNet is available at http://biohealth.snu.ac.kr/software/PropaNet. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

31. A guide to gene regulatory network inference for obtaining predictive solutions: Underlying assumptions and fundamental biological and data constraints.

Author: Barbosa, Sara, Niebel, Bastian, Wolf, Sebastian, Mauch, Klaus, and Takors, Ralf
Subjects: *GENE regulatory networks, *BIOLOGICAL systems, *BIOLOGICAL models, *SYSTEMS biology, *INFORMATION theory
Abstract: The study of biological systems at a system level has become a reality due to increasingly powerful computational approaches able to handle increasingly large datasets. Uncovering the dynamic nature of gene regulatory networks in order to attain a system-level understanding and improve the predictive power of biological models is an important research field in systems biology. The task itself presents several challenges, since the problem is of a combinatorial nature and depends strongly on several biological constraints as well as on the intended application. Given the intrinsically interdisciplinary nature of gene regulatory network inference, we present a review of the currently available approaches, their challenges and limitations. We propose guidelines to select the most appropriate method considering the underlying assumptions and fundamental biological and data constraints. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

32. Combination of gene regulatory networks and sequential machine learning for drug repurposing

Author: Clémence Réda (Maladies neurodéveloppementales et neurovasculaires, NeuroDiderot, UMR_S_1141 / U1141, Institut National de la Santé et de la Recherche Médicale (INSERM) and Université Paris Cité (UPCité); Scool, Inria Lille - Nord Europe, Institut National de Recherche en Informatique et en Automatique (Inria); Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL, UMR 9189), Centrale Lille, Université de Lille and Centre National de la Recherche Scientifique (CNRS)), Andrée Delahaye-Duriez, and Emilie Kaufmann
Subjects: [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM], [MATH] Mathematics [math], [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST], Sequential learning, Multi-armed bandits, Drug repurposing, Drug testing, Boolean networks, Gene regulatory network inference, Transcriptomic data analysis
Abstract: Given the ever-increasing cost of designing de novo molecules to target causes of diseases, and the huge amount of currently available biological data, the development of systematic explorative pipelines for drug development has become of paramount importance. In my thesis, I focused on drug repurposing, a paradigm that aims at identifying new therapeutic indications for known chemical compounds. Because a large collection of transcriptomic data (that is, data related to protein production through the transcription of gene DNA sequences) is already publicly available, I investigated how to process this information about gene activity in a transparent and controllable way to screen molecules. The current state of research in drug development indicates that such generic approaches might considerably speed up the discovery of promising therapies, especially in research on neglected or rare diseases. First, noting that transcriptomic measurements are the product of a complex dynamical system of co- and inter-gene activity regulations, I worked on integrating diverse types of biological information in an automated fashion in order to build a model of these regulations. That is where gene regulatory networks, and more specifically Boolean networks, come in. Such models are useful both for explaining observed transcription levels and for predicting the result of gene activity perturbations through molecules. Second, these models allow online in silico drug testing. While using the predictive features of Boolean networks can be costly, the core assumption of this thesis is that combining them with sequential learning algorithms, such as multi-armed bandits, might mitigate that cost and help control the error rate in recommended therapeutic candidates. This is the drug testing procedure suggested throughout my PhD. The question of the proper integration of known side information about the chemical compounds into multi-armed bandits is crucial, and has also been investigated further. Finally, I applied part of my work to ranking different treatment protocols for neurorepair in the case of encephalopathy in premature infants. On the theoretical side, I also contributed to the design of an algorithm that can extend the drug testing procedure in a distributed way, for instance across several tested populations, disease models, or research teams.
Published: 2022

33. Improving network inference algorithms using resampling methods.

Author: Colby, Sean M, McClure, Ryan S, Renslow, Ryan S, McDermott, Jason E, and Overall, Christopher C
Subjects: *BOOTSTRAP aggregation (Algorithms), *GENE regulatory networks, *GENE expression, *ESCHERICHIA coli, *RESAMPLING (Statistics), *AGGREGATION (Statistics), *BIOLOGICAL networks
Abstract: Background: Relatively small changes to gene expression data dramatically affect co-expression networks inferred from that data which, in turn, can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes the generalization from statistical theory holds true in biological network inference applications. Results: We evaluated the effect of bootstrap aggregation on inferred networks using commonly applied network inference methods in terms of stability, or resilience to perturbations in the underlying expression data, a metric for accuracy, and functional enrichment of edge interactions. Conclusion: Bootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement to accuracy assessed by each method's ability to link genes in the same functional pathway. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF
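
Illustrative note (not taken from the article): the abstract of record 33 describes bootstrap aggregation of inferred networks. A minimal sketch of the idea follows, under stated assumptions: the inference method is a plain Pearson co-expression network with a hard threshold, and the names `bagged_edge_support`, `n_boot` and `threshold` are hypothetical. The article evaluates other inference methods and its own stability metrics.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def bagged_edge_support(expr, n_boot=200, threshold=0.8, seed=0):
    """Bootstrap-aggregate a thresholded co-expression network.

    expr: dict mapping gene name -> list of expression values (one per sample).
    Returns a dict mapping (gene_a, gene_b) -> fraction of bootstrap resamples
    in which |Pearson r| >= threshold (assumed edge rule, for illustration only).
    """
    rng = random.Random(seed)
    genes = sorted(expr)
    n_samples = len(expr[genes[0]])
    counts = {pair: 0 for pair in combinations(genes, 2)}
    for _ in range(n_boot):
        # Resample the samples (columns) with replacement.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        for a, b in counts:
            xa = [expr[a][i] for i in idx]
            xb = [expr[b][i] for i in idx]
            if abs(pearson(xa, xb)) >= threshold:
                counts[(a, b)] += 1
    return {pair: c / n_boot for pair, c in counts.items()}


# Toy data: g2 tracks g1 exactly, g3 is only weakly related to g1.
expr = {
    "g1": [1, 2, 3, 4, 5, 6, 7, 8],
    "g2": [2, 4, 6, 8, 10, 12, 14, 16],
    "g3": [4, 5, 3, 6, 2, 7, 1, 8],
}
support = bagged_edge_support(expr)
for pair, frac in sorted(support.items()):
    print(pair, round(frac, 2))
```

Edges that persist across most resamples can then be kept, while edges driven by a few outlying samples receive low support.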

34. Estimating drivers of cell state transitions using gene regulatory network models.

Author: Schlauch, Daniel, Glass, Kimberly, Hersh, Craig P., Silverman, Edwin K., and Quackenbush, John
Subjects: *GENE expression, *LUNG diseases, *OBSTRUCTIVE lung diseases, *GENE regulatory networks, *GENETIC regulation
Abstract: Background: Specific cellular states are often associated with distinct gene expression patterns. These states are plastic, changing during development, or in the transition from health to disease. One relatively simple extension of this concept is to recognize that we can classify different cell-types by their active gene regulatory networks and that, consequently, transitions between cellular states can be modeled by changes in these underlying regulatory networks. Results: Here we describe MONSTER, MOdeling Network State Transitions from Expression and Regulatory data, a regression-based method for inferring transcription factor drivers of cell state conditions at the gene regulatory network level. As a demonstration, we apply MONSTER to four different studies of chronic obstructive pulmonary disease to identify transcription factors that alter the network structure as the cell state progresses toward the disease-state. Conclusions: We demonstrate that MONSTER can find strong regulatory signals that persist across studies and tissues of the same disease and that are not detectable using conventional analysis methods based on differential expression. An R package implementing MONSTER is available at github.com/QuackenbushLab/MONSTER. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

35. PEAK: Integrating Curated and Noisy Prior Knowledge in Gene Regulatory Network Inference.

Author: Altarawy, Doaa, Eid, Fatma-Elzahraa, and Heath, Lenwood S.
Subjects: *GENE regulatory networks, *PRIOR learning, *GENETIC databases, *PREDICTION models, *COMPUTATIONAL biology
Abstract: With the abundance of biological data, computational prediction of gene regulatory networks (GRNs) from gene expression data has become more feasible. Although incorporating other prior knowledge (PK), along with gene expression data, greatly improves prediction accuracy, the overall accuracy is still low. PK in GRN inference can be categorized into noisy and curated. In noisy PK, such as transcription factor binding and protein-protein interactions, relations between genes do not necessarily correspond to regulatory relations and are thus considered inaccurate by inference algorithms. In contrast, curated PK consists of experimentally verified regulatory interactions in pathway databases. An issue in real data is that gene expression can poorly support the curated PK, and thus most existing prediction algorithms cannot use this curated PK. Although several algorithms were proposed to incorporate noisy PK, none address curated PK with poor gene expression support. We present PEAK, a system to integrate both curated and noisy PK in GRN inference, especially with poor gene expression support. We introduce a novel method for GRN inference, CurInf, to effectively integrate curated PK, even when the gene expression data poorly support the PK. PEAK also uses the previously proposed method Modified Elastic Net to incorporate noisy PK, and we call it NoisInf. In our experiment, CurInf significantly incorporates curated PK, which was regarded as noise by previous methods. Using 100% curated PK, CurInf improves the area under precision-recall curve accuracy score over NoisInf by 27.3% in synthetic data, 86.5% in Escherichia coli data, and 31.1% in Saccharomyces cerevisiae data. Moreover, even when the noise in PK is 10 times more than true PK, PEAK performs better than inference without any PK. Better integration of curated PK helps biologists benefit from verified experimental data to predict more reliable GRNs. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

36. DTNI: a novel toxicogenomics data analysis tool for identifying the molecular mechanisms underlying the adverse effects of toxic compounds.

Author: Hendrickx, Diana, Souza, Terezinha, Jennen, Danyel, and Kleinjans, Jos
Subjects: *TOXICOGENOMICS, *DOSE-effect relationship in pharmacology, *GENE regulatory networks, *ADVERSE health care events, *TIME series analysis, *ORDINARY differential equations
Abstract: Unravelling gene regulatory networks (GRNs) influenced by chemicals is a major challenge in systems toxicology. Because toxicant-induced GRNs evolve over time and dose, the analysis of global gene expression data measured at multiple time points and doses will provide insight into the adverse effects of compounds. Therefore, there is a need for mathematical methods for GRN identification from time-over-dose-dependent data. One of the current approaches for GRN inference is Time Series Network Identification (TSNI). TSNI is based on ordinary differential equations (ODEs) describing the time evolution of the expression of each gene, which is assumed to be dependent on the expression of other genes and an external perturbation (i.e. chemical exposure). Here, we present Dose-Time Network Identification (DTNI), a method extending TSNI by including ODEs describing how the expression of each gene evolves with dose, which is supposed to depend on the expression of other genes and the exposure time. We also adapted TSNI in order to enable inclusion of time-over-dose-dependent data from multiple compounds. Here, we show that DTNI outperforms TSNI in inferring a toxicant-induced GRN. Moreover, we show that DTNI is a suitable method to infer a GRN dose- and time-dependently induced by a group of compounds influencing a common biological process. Applying DTNI to experimental data from TG-GATEs, we demonstrate that DTNI provides in-depth information on the mode of action of compounds, in particular key events and potential molecular initiating events. Furthermore, DTNI also discloses several unknown interactions which have to be verified experimentally. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

37. Detection of statistically significant network changes in complex biological networks.

Author: Mall, Raghvendra, Cerulo, Luigi, Bensmail, Halima, Iavarone, Antonio, and Ceccarelli, Michele
Subjects: *BIOLOGICAL networks, *MOLECULAR interactions, *GENE expression, *DISEASE progression, *HAMMING distance
Abstract: Background: Biological networks help to unveil the complex structure of molecular interactions and to discover driver genes, especially in the context of cancer. Gene mutations, for example during cancer progression, can cause the gene expression network to undergo some amount of localized re-wiring. The ability to detect statistically relevant changes in the interaction patterns induced by the progression of the disease can lead to the discovery of novel relevant signatures. Several procedures have recently been proposed to detect sub-network differences in pairwise labeled weighted networks. Methods: In this paper, we propose an improvement over the state-of-the-art based on the Generalized Hamming Distance, which is adopted for evaluating the topological difference between two networks and estimating its statistical significance. The proposed procedure exploits a more effective model selection criterion to generate p-values for statistical significance and is more efficient in terms of computational time and prediction accuracy than methods from the literature. Moreover, the structure of the proposed algorithm allows for a faster parallelized implementation. Results: In the case of dense random geometric networks, the proposed approach is 10-15x faster and achieves 5-10% higher AUC, Precision/Recall, and Kappa value than the state-of-the-art. We also report the application of the method to dissect the difference between the regulatory networks of IDH-mutant versus IDH-wild-type glioma cancer. In this case, our method is able to identify some recently reported master regulators as well as novel important candidates. Conclusions: We show that our network differencing procedure can effectively and efficiently detect statistically significant network re-wirings in different conditions. 
When applied to detect the main differences between the networks of IDH-mutant and IDH-wild-type glioma tumors, it correctly selects sub-networks centered on important key regulators of these two different subtypes. In addition, its application highlights several novel candidates that cannot be detected by standard single network-based approaches. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

38. Computational inference of gene regulatory networks: Approaches, limitations and opportunities.

Author: Banf, Michael and Rhee, Seung Y.
Abstract: Gene regulatory networks lie at the core of cell function control. In E. coli and S. cerevisiae, the study of gene regulatory networks has led to the discovery of regulatory mechanisms responsible for the control of cell growth, differentiation and responses to environmental stimuli. In plants, computational rendering of gene regulatory networks is gaining momentum, thanks to the recent availability of high-quality genomes and transcriptomes and development of computational network inference approaches. Here, we review current techniques, challenges and trends in gene regulatory network inference and highlight challenges and opportunities for plant science. We provide plant-specific application examples to guide researchers in selecting methodologies that suit their particular research questions. Given the interdisciplinary nature of gene regulatory network inference, we tried to cater to both biologists and computer scientists to help them engage in a dialogue about concepts and caveats in network inference. Specifically, we discuss problems and opportunities in heterogeneous data integration for eukaryotic organisms and common caveats to be considered during network model evaluation. This article is part of a Special Issue entitled: Plant Gene Regulatory Mechanisms and Networks, edited by Dr. Erich Grotewold and Dr. Nathan Springer. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

39. Studying temporal dynamics of single cells: expression, lineage and regulatory networks.

Author: Pan X and Zhang X
Abstract: Learning how multicellular organs are developed from single cells to different cell types is a fundamental problem in biology. With the high-throughput scRNA-seq technology, computational methods have been developed to reveal the temporal dynamics of single cells from transcriptomic data, from phenomena on cell trajectories to the underlying mechanism that formed the trajectory. There are several distinct families of computational methods including Trajectory Inference (TI), Lineage Tracing (LT), and Gene Regulatory Network (GRN) Inference which are involved in such studies. This review summarizes these computational approaches which use scRNA-seq data to study cell differentiation and cell fate specification as well as the advantages and limitations of different methods. We further discuss how GRNs can potentially affect cell fate decisions and trajectory structures. Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-023-01090-5. Competing Interests: The authors have no relevant financial or non-financial interests to disclose.
Published: 2023
Full Text: View/download PDF

40. Gene regulatory network inference using PLS-based methods.

Author: Shun Guo, Qingshan Jiang, Lifei Chen, and Donghui Guo
Subjects: *GENE regulatory networks, *LEAST squares, *GENE expression, *TARGETED drug delivery, *DNA microarrays
Abstract: Background: Inferring the topology of gene regulatory networks (GRNs) from microarray gene expression data has many potential applications, such as identifying candidate drug targets and providing valuable insights into biological processes. It remains a challenge because the data are noisy and high-dimensional and there is a large number of potential interactions. Results: We introduce an ensemble gene regulatory network inference method, PLSNET, which decomposes the GRN inference problem with p genes into p subproblems and solves each subproblem using a partial least squares (PLS)-based feature selection algorithm. Then, a statistical technique is used to refine the predictions in our method. The proposed method was evaluated on the DREAM4 and DREAM5 benchmark datasets and achieved higher accuracy than the winners of those competitions and other state-of-the-art GRN inference methods. Conclusions: Superior accuracy achieved on different benchmark datasets, including both in silico and in vivo networks, shows that PLSNET reaches state-of-the-art performance. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

41. PFBNet: a priori-fused boosting method for gene regulatory network inference

Author: Lifei Chen, Shun Guo, Qingshan Jiang, and Dandan Che
Subjects: Boosting (machine learning), Computer science, Systems biology, 0206 medical engineering, Gene regulatory network, Inference, Gene Expression, 02 engineering and technology, Prior information fusion, Machine learning, computer.software_genre, lcsh:Computer applications to medicine. Medical informatics, Biochemistry, Boosting, 03 medical and health sciences, Gene regulatory network inference, Structural Biology, Escherichia coli, Gene Regulatory Networks, Molecular Biology, lcsh:QH301-705.5, 030304 developmental biology, 0303 health sciences, business.industry, Applied Mathematics, Methodology Article, Computational Biology, Computer Science Applications, ROC Curve, lcsh:Biology (General), Area Under Curve, A priori and a posteriori, lcsh:R858-859.7, Artificial intelligence, DNA microarray, business, computer, Time-series expression data, 020602 bioinformatics, Algorithms
Abstract: Background: Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in systems biology. In the past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge because the data are noisy and high-dimensional and there is a large number of potential interactions. Results: We present a novel method, namely the priori-fused boosting network inference method (PFBNet), to infer GRNs from time-series expression data by using the non-linear model of boosting and a fusion scheme for prior information (e.g., knockout data). Specifically, PFBNet first calculates the confidences of the regulation relationships using the boosting-based model, where the information about the accumulated impact of the gene expressions at previous time points is taken into account. Then, a newly defined strategy is applied to fuse the information from the prior data by elevating the confidences of the regulation relationships from the corresponding regulators. Conclusions: The experiments on the benchmark datasets from the DREAM challenge as well as the E. coli datasets show that PFBNet achieves significantly better performance than other state-of-the-art methods (Jump3, GENIE3-lag, HiDi, iRafNet and BiXGBoost).
Published: 2020

42. Integration of single-cell multi-omics for gene regulatory network inference

Author: Ricky Wai Tak Leung, Yaohua Hu, Jing Qin, Xinlin Hu, and Fanjie Wu
Subjects: lcsh:Biotechnology, Sequencing data, Biophysics, Gene regulatory network, Review Article, Computational biology, Biochemistry, Genome, Data type, Single-cell multi-omics integration, 03 medical and health sciences, 0302 clinical medicine, Gene regulatory network inference, Structural Biology, lcsh:TP248.13-248.65, Genetics, ComputingMethodologies_COMPUTERGRAPHICS, 030304 developmental biology, 0303 health sciences, Epigenome, Computer Science Applications, Single cell sequencing, Single-cell sequencing, 030220 oncology & carcinogenesis, Multi omics, Biotechnology
Abstract: The advancement of single-cell sequencing technology in recent years has provided an opportunity to reconstruct gene regulatory networks (GRNs) with the data from thousands of single cells in one sample. This uncovers regulatory interactions in cells and speeds up the discovery of regulatory mechanisms in diseases and biological processes. Therefore, more methods have been proposed to reconstruct GRNs using single-cell sequencing data. In this review, we introduce technologies for sequencing the single-cell genome, transcriptome, and epigenome. At the same time, we present an overview of current GRN reconstruction strategies utilizing different single-cell sequencing data. Bioinformatics tools are grouped by their input data type and mathematical principles for the reader's convenience, and the fundamental mathematics inherent in each group is discussed. Furthermore, the adaptabilities and limitations of these different methods are summarized and compared, in the hope of helping researchers identify the most suitable tools for them.
Published: 2020

43. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data

Author: Pratapa, Aditya, Jalihal, Amogh P., Law, Jeffrey N., Bharadwaj, Aditya, and Murali, T. M.
Subjects: Computer science, 0206 medical engineering, Gene regulatory network, Inference, Datasets as Topic, 02 engineering and technology, Biochemistry, Article, 03 medical and health sciences, Gene regulatory network inference, Gene Regulatory Networks, Molecular Biology, 030304 developmental biology, 0303 health sciences, Ground truth, Extramural, End user, Sequence Analysis, RNA, Cell Biology, Benchmarking, humanities, Single-Cell Analysis, Transcriptome, Algorithm, 020602 bioinformatics, Algorithms, Biotechnology
Abstract: We present a comprehensive evaluation of state-of-the-art algorithms for inferring gene regulatory networks (GRNs) from single-cell gene expression data. We develop a systematic framework called BEELINE for this purpose. We use synthetic networks with predictable cellular trajectories as well as curated Boolean models to serve as the ground truth for evaluating the accuracy of GRN inference algorithms. We develop a strategy to simulate single-cell gene expression data from these two types of networks that avoids the pitfalls of previously-used methods. We selected 12 representative GRN inference algorithms. We found that the accuracy of these methods (measured in terms of AUROC and AUPRC) was moderate, by and large, although the methods were better in recovering interactions in the synthetic networks than the Boolean models. Techniques that did not require pseudotime-ordered cells were more accurate, in general. The observation that the endpoints of many false positive edges were connected by paths of length two in the Boolean models suggested that indirect effects may be predominant in the outputs of the algorithms we tested. The predicted networks were considerably inconsistent with each other, indicating that combining GRN inference algorithms using ensembles is likely to be challenging. Based on the results, we present some recommendations to users of GRN inference algorithms, including suggestions on how to create simulated gene expression datasets for testing them. BEELINE, which is available at http://github.com/murali-group/BEELINE under an open-source license, will aid in the future development of GRN inference algorithms for single-cell transcriptomic data.
Published: 2020
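
Illustrative note (not taken from the article): the abstract of record 43 reports accuracy in terms of AUROC and AUPRC against a ground-truth network. The sketch below computes average precision, a common step-wise estimate of AUPRC, for a ranked edge list. The function name and the toy edges are hypothetical, and BEELINE's own evaluation code may compute the curve differently.

```python
def average_precision(ranked_edges, true_edges):
    """Average precision of a ranked edge list against a reference network.

    ranked_edges: list of (regulator, target) pairs, most confident first.
    true_edges:   set of (regulator, target) pairs taken as ground truth.
    """
    if not true_edges:
        raise ValueError("need at least one reference edge")
    hits = 0
    total = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank  # precision at the rank where recall increases
    return total / len(true_edges)


# Toy ranking: the 1st and 3rd predictions are true edges.
ranked = [("TF1", "g1"), ("TF2", "g1"), ("TF1", "g2"), ("TF2", "g3")]
truth = {("TF1", "g1"), ("TF1", "g2")}
print(average_precision(ranked, truth))
```

Reference edges that never appear in the ranking contribute zero precision, so an incomplete prediction list is penalized.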

44. Gene Regulatory Network Inference Using Ensembles of Predictors

Author: Peignier, Sergio, Sorin, Baptiste, Calevro, Federica, Biologie Fonctionnelle, Insectes et Interactions (BF2I), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), and IEEE
Subjects: ComputingMethodologies_PATTERNRECOGNITION, Bioinformatics, Gene Regulatory Network Inference, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Ensemble Learning, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Abstract: In the machine learning field, the technique known as ensemble learning aims at combining different base learners in order to increase the quality and the robustness of the predictions. Indeed, this approach has been widely and successfully applied to real-world problems from different domains, including computational biology. Nevertheless, despite the potential of this technique, ensembles that combine results from different kinds of algorithms have been understudied in the context of gene regulatory network inference. In this paper, we used a genetic algorithm and frequent itemset mining to study and design effective ensembles for reverse-engineering gene regulatory networks from high-throughput data. The methods proposed here were evaluated and compared to well-established single and ensemble methods on real and synthetic datasets. Results demonstrate the efficiency and the robustness of these new methods, advocating for their use as gene regulatory network inference tools.
Published: 2021

45. Bayesian estimation of the discrete coefficient of determination.

Author: Chen, Ting and Braga-Neto, Ulisses
Subjects: *GENE expression, *GENOMICS, *SIGNAL processing, *BAYESIAN analysis, *DISCRETE systems, *PARAMETER estimation
Abstract: The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discrete CoD. We derive analytically the optimal minimum mean-square error (MMSE) CoD estimator, as well as a CoD estimator based on the Optimal Bayesian Predictor (OBP). For the latter estimator, exact expressions for its bias, variance, and root-mean-square (RMS) are given. The accuracy of both Bayesian CoD estimators with non-informative and informative priors, under fixed or random parameters, is studied via analytical and numerical approaches. We also demonstrate the application of the proposed Bayesian approach in the inference of gene regulatory networks, using gene-expression data from a previously published study on metastatic melanoma. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF
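
Illustrative note (not taken from the article): the discrete coefficient of determination in record 45 is usually defined as CoD = (e0 - e) / e0, where e0 is the error of the best prediction of the target without any predictors and e is the error of the best prediction given the predictors. The sketch below is a plain plug-in (resubstitution) estimate from counts, which belongs to the classical nonparametric approaches the abstract mentions and is not the Bayesian estimator proposed in the article. The function name is hypothetical.

```python
from collections import Counter, defaultdict


def discrete_cod(predictors, target):
    """Plug-in (resubstitution) estimate of the discrete CoD.

    predictors: list of hashable predictor values (e.g. tuples of 0/1), one per sample.
    target:     list of discrete target values, same length.
    """
    n = len(target)
    if n == 0 or n != len(predictors):
        raise ValueError("predictors and target must be non-empty and equally long")
    # e0: error of always guessing the most frequent target value.
    e0 = 1 - max(Counter(target).values()) / n
    if e0 == 0:
        raise ValueError("target is constant; the CoD needs a convention in this case")
    # e: error of guessing the most frequent target value within each predictor cell.
    cells = defaultdict(Counter)
    for x, y in zip(predictors, target):
        cells[x][y] += 1
    e = sum(sum(c.values()) - max(c.values()) for c in cells.values()) / n
    return (e0 - e) / e0


# XOR target: neither predictor is informative alone, both together determine it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
y = [a ^ b for a, b in samples]
print(discrete_cod(samples, y))                   # both predictors
print(discrete_cod([a for a, _ in samples], y))   # first predictor only
```

The XOR case shows why the measure is described as capturing nonlinear interaction: a single predictor yields no improvement over guessing, while the pair predicts the target perfectly.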

46. Evaluating the Reproducibility of Single-Cell Gene Regulatory Network Inference Algorithms

Author: Denis Thieffry, Laura Cantini, Yoonjee Kang, Institut de biologie de l'ENS Paris (UMR 8197/1024) (IBENS), Département de Biologie - ENS Paris, École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL), Institut National de la Santé et de la Recherche Médicale (INSERM), Centre National de la Recherche Scientifique (CNRS), and ANR-20-CE45-0015, scMOmix, Méthodes pour l'intégration de données multi-omiques en cellule-unique (2020)
Subjects: lcsh:QH426-470, Computer science, [SDV]Life Sciences [q-bio], Inference, [MATH] Mathematics [math], Network theory, network theory, 03 medical and health sciences, Annotation, 0302 clinical medicine, Gene regulatory network inference, scRNA-seq, Genetics, [MATH]Mathematics [math], reproducibility, Genetics (clinical), 030304 developmental biology, Original Research, Reproducibility, 0303 health sciences, Intersection (set theory), single-cell, Thresholding, [STAT] Statistics [stat], [SDV] Life Sciences [q-bio], [STAT]Statistics [stat], lcsh:Genetics, biological networks, network inference, Benchmark (computing), Molecular Medicine, Algorithm, Functional genomics, transcriptome, 030217 neurology & neurosurgery, Biological network
Abstract: Networks are powerful tools to represent and investigate biological systems. The development of algorithms inferring regulatory interactions from functional genomics data has been an active area of research. With the advent of single-cell RNA-seq data (scRNA-seq), numerous methods specifically designed to take advantage of single-cell datasets have been proposed. However, published benchmarks on single-cell network inference are mostly based on simulated data. When applied to real data, these benchmarks take into account only a small set of genes and only compare the inferred networks with an imposed ground truth. Here, we benchmark six single-cell network inference methods based on their reproducibility, i.e., their ability to infer similar networks when applied to two independent datasets for the same biological condition. We tested each of these methods on real data from three biological conditions: human retina, T-cells in colorectal cancer, and human hematopoiesis. When networks with up to 100,000 links are considered, GENIE3 proves to be the most reproducible algorithm and, together with GRNBoost2, shows a higher intersection with ground-truth biological interactions. These results are independent of the single-cell sequencing platform, the cell type annotation system and the number of cells constituting the dataset. Finally, GRNBoost2 and CLR show more reproducible performance once a more stringent thresholding is applied to the networks (1,000–100 links). In order to ensure the reproducibility of this benchmark study and to ease its extension, we implemented all the analyses in scNET, a Jupyter notebook available at https://github.com/ComputationalSystemsBiology/scNET.
Published: 2021
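
Illustrative note (not taken from the article): the abstract of record 46 defines reproducibility as the ability to infer similar networks from two independent datasets, after thresholding each network to a fixed number of links. A minimal sketch of one way to quantify this follows, assuming the comparison is an overlap of the top-k edges. The abstract does not state the exact index used, and the function names and toy scores are hypothetical.

```python
def top_k_edges(scores, k):
    """Return the set of the k highest-scoring edges.

    scores: dict mapping (regulator, target) -> confidence score.
    Ties are broken by edge name so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return set(ranked[:k])


def reproducibility(scores_a, scores_b, k):
    """Overlap of the top-k edges of two networks inferred from independent datasets.

    Returns (shared fraction of k, Jaccard index of the two top-k edge sets).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    a, b = top_k_edges(scores_a, k), top_k_edges(scores_b, k)
    shared = a & b
    return len(shared) / k, len(shared) / len(a | b)


# Toy edge scores from the same method run on two datasets.
net1 = {("TF1", "g1"): 0.9, ("TF1", "g2"): 0.8, ("TF2", "g1"): 0.3, ("TF2", "g3"): 0.1}
net2 = {("TF1", "g1"): 0.7, ("TF2", "g1"): 0.6, ("TF1", "g2"): 0.2, ("TF2", "g3"): 0.5}
print(reproducibility(net1, net2, k=2))
```

Repeating the calculation for several values of k mirrors the thresholds mentioned in the abstract (from 100,000 down to 100 links).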

47. Inferring the experimental design for accurate gene regulatory network inference

Author: Deniz Seçilmiş, Sven Nelander, Erik L. L. Sonnhammer, and Thomas Hillerton
Subjects: Statistics and Probability, AcademicSubjects/SCI01060, Bioinformatics and Systems Biology, Computer science, Test data generation, Systems Biology, Design of experiments, SIGNAL (programming language), Gene regulatory network, Inference, Bioinformatik och systembiologi, computer.software_genre, Original Papers, Biochemistry, Synthetic data, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Gene regulatory network inference, Noise (video), Data mining, Molecular Biology, computer
Abstract: Motivation: Accurate inference of gene regulatory interactions is of importance for understanding the mechanisms of underlying biological processes. For gene expression data gathered from targeted perturbations, gene regulatory network (GRN) inference methods that use the perturbation design are the top performing methods. However, the connection between the perturbation design and gene expression can be obfuscated due to problems, such as experimental noise or off-target effects, limiting the methods’ ability to reconstruct the true GRN. Results: In this study, we propose an algorithm, IDEMAX, to infer the effective perturbation design from gene expression data in order to eliminate the potential risk of fitting a disconnected perturbation design to gene expression. We applied IDEMAX to synthetic data from two different data generation tools, GeneNetWeaver and GeneSPIDER, and assessed its effect on the experiment design matrix as well as the accuracy of the GRN inference, followed by application to a real dataset. The results show that our approach consistently improves the accuracy of GRN inference compared to using the intended perturbation design when much of the signal is hidden by noise, which is often the case for real data. Availability and implementation: https://bitbucket.org/sonnhammergrni/idemax. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2021

48. Perturbation-based gene regulatory network inference to unravel oncogenic mechanisms

Author: Holger Weishaupt, Torbjörn E. M. Nordling, Andreas Tjärnberg, Daniel Morgan, Matthew Studham, Fredrik J. Swartling, and Erik L. L. Sonnhammer
Subjects: 0301 basic medicine, Microarrays, Carcinogenesis, Computer science, Systems biology, Science, Cell, Gene regulatory network, Inference, Computational biology, Bioinformatik och systembiologi, Article, Culture Media, Serum-Free, Gene regulatory networks, 03 medical and health sciences, 0302 clinical medicine, RNA interference, Cell Line, Tumor, Gene regulatory network inference, medicine, Humans, RNA, Small Interfering, Gene, Multidisciplinary, Bioinformatics and Systems Biology, Brain Neoplasms, Null model, Glioma, Regulatory networks, 030104 developmental biology, medicine.anatomical_structure, RNAi, Gene Knockdown Techniques, Carcinoma, Squamous Cell, Medicine, RNA Interference, Monte Carlo Method, 030217 neurology & neurosurgery, Genes, Neoplasm
Abstract: The gene regulatory network (GRN) of human cells encodes mechanisms to ensure proper functioning. However, if this GRN is dysregulated, the cell may enter into a disease state such as cancer. Understanding the GRN as a system can therefore help identify novel mechanisms underlying disease, which can lead to new therapies. Reliable inference of GRNs is however still a major challenge in systems biology. To deduce regulatory interactions relevant to cancer, we applied a recent computational inference framework to data from perturbation experiments in squamous carcinoma cell line A431. GRNs were inferred using several methods, and the false discovery rate was controlled by the NestBoot framework. We developed a novel approach to assess the predictiveness of inferred GRNs against validation data, despite the lack of a gold standard. The best GRN was significantly more predictive than the null model, both in cross-validated benchmarks and for an independent dataset of the same genes under a different perturbation design. It agrees with many known links, in addition to predicting a large number of novel interactions from which a subset was experimentally validated. The inferred GRN captures regulatory interactions central to cancer-relevant processes and thus provides mechanistic insights that are useful for future cancer research. Data available at GSE125958. Inferred GRNs and inference statistics available at https://dcolin.shinyapps.io/CancerGRN/. Software available at https://bitbucket.org/sonnhammergrni/genespider/src/BFECV/. Author Summary: Cancer is the second most common cause of death globally, and although cancer treatments have improved in recent years, we need to understand how regulatory mechanisms are altered in cancer to combat the disease efficiently. By applying gene perturbations and inference of gene regulatory networks to 40 genes known or suspected to have a role in cancer due to interactions with the oncogene MYC, we deduce their underlying regulatory interactions. Using a recent computational framework for inference together with a novel method for cross-validation, we infer a reliable regulatory model of this system in a completely data-driven manner, not reliant on literature or priors. The novel interactions add to the understanding of the progressive oncogenic regulatory process and may provide new targets for therapy.
Published: 2020

49. A novel Boolean network inference strategy to model early hematopoiesis aging.

Author: Hérault L, Poplineau M, Duprez E, and Remy É
Abstract: Hematopoietic stem cell (HSC) aging is a multifactorial event leading to changes in HSC properties and functions, which are intrinsically coordinated and affect the early hematopoiesis. To better understand the mechanisms and factors controlling these changes, we developed an original strategy to construct a Boolean model of HSC differentiation. Based on our previous scRNA-seq data, we exhaustively characterized active transcription modules or regulons along the differentiation trajectory and constructed an influence graph between 15 selected components involved in the dynamics of the process. Then we defined dynamical constraints between observed cellular states along the trajectory and, using answer set programming with in silico perturbation analysis, we obtained a Boolean model explaining the early priming of HSCs. Finally, perturbations of the model based on age-related changes revealed important deregulations, such as the overactivation of Egr1 and Junb or the loss of Cebpa activation by Gata2. These new regulatory mechanisms were found to be relevant for the myeloid bias of aged HSC and explain the decreased transcriptional priming of HSCs to all mature cell types except megakaryocytes. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (© 2022 The Authors.)
Published: 2022

50. Inferring causal gene regulatory network via GreyNet: From dynamic grey association to causation.

Author: Chen G and Liu ZP
Abstract: Gene regulatory networks (GRNs) provide abundant information on gene interactions, which contributes to demonstrating pathology, predicting clinical outcomes, and identifying drug targets. Existing high-throughput experiments provide rich time-series gene expression data from which GRNs can be reconstructed to gain further insight into how organisms respond to external stimuli. Numerous machine-learning methods have been proposed to infer gene regulatory networks. Nevertheless, machine learning, especially deep learning, is generally a "black box" that lacks interpretability, and causality has not been well recognized in GRN inference procedures. In this article, we introduce grey theory integrated with the adaptive sliding window technique to flexibly capture instant gene-gene interactions in the uncertain regulatory system. Then, we incorporate generalized multivariate Granger causality regression methods to transform the dynamic grey association into causation to generate directional regulatory links. We evaluate our model on the DREAM4 in silico benchmark dataset and real-world hepatocellular carcinoma (HCC) time-series data. We achieved competitive results on DREAM4 compared with other state-of-the-art algorithms and obtained a meaningful GRN structure on the HCC data. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2022 Chen and Liu.)
Published: 2022
