1,062 results
Search Results
2. Ten simple rules for collaboratively writing a multi-authored paper.
- Author
Frassl, Marieke A., Hamilton, David P., Denfeld, Blaize A., de Eyto, Elvira, Hampton, Stephanie E., Keller, Philipp S., Sharma, Sapna, Lewis, Abigail S. L., Weyhenmeyer, Gesa A., O’Reilly, Catherine M., Lofton, Mary E., and Catalán, Núria
- Subjects
AUTHORSHIP collaboration, COLLABORATIVE learning, ACKNOWLEDGEMENTS (Academic dissertations), INFORMATION resources management, GROUP work in research
- Abstract
This editorial discusses collaborative writing with multiple authors, which poses additional challenges, including varied levels of coauthor engagement, fair credit through authorship or acknowledgements, and acceptance of diverse work styles. It also presents 10 simple rules for collaboratively writing a multi-authored paper, among them building a writing team wisely, creating a data management plan, and jointly deciding on authorship guidelines.
- Published
- 2018
3. Women are underrepresented in computational biology: An analysis of the scholarly literature in biology, computer science and computational biology.
- Author
Bonham, Kevin S. and Stefan, Melanie I.
- Subjects
STEM education, COMPUTATIONAL biology, BIBLIOMETRICS, SCIENCE publishing, SCIENCE & state
- Abstract
While women are generally underrepresented in STEM fields, there are noticeable differences between fields. For instance, the gender ratio in biology is more balanced than in computer science. We were interested in how this difference is reflected in the interdisciplinary field of computational/quantitative biology. To this end, we examined the proportion of female authors in publications from the PubMed and arXiv databases. There are fewer female authors on research papers in computational biology, as compared to biology in general. This is true across authorship position, year, and journal impact factor. A comparison with arXiv shows that quantitative biology papers have a higher ratio of female authors than computer science papers, placing computational biology in between its two parent fields in terms of gender representation. Both in biology and in computational biology, a female last author increases the probability of other authors on the paper being female, pointing to a potential role of female PIs in influencing the gender balance. [ABSTRACT FROM AUTHOR]
- Published
- 2017
4. Details in the evaluation of circular RNA detection tools: Reply to Chen and Chuang.
- Author
Zeng, Xiangxiang, Lin, Wei, Guo, Maozu, and Zou, Quan
- Subjects
CIRCULAR RNA, BIG data, TOXINS, PLASMIDS, DATABASES
- Abstract
In their comment, Chen and Chuang [] pointed out several weak points of our recent paper []. Here we respond in detail to clarify the dataset we used in our work. We also discuss the three confounding factors they listed in their comment. [ABSTRACT FROM AUTHOR]
- Published
- 2019
5. Generation of Binary Tree-Child phylogenetic networks.
- Author
Cardona, Gabriel, Pons, Joan Carles, and Scornavacca, Celine
- Subjects
BOTANY, PHYSICAL sciences, BINARY number system, LIFE sciences, PLANT anatomy, GRAPH theory
- Abstract
Phylogenetic networks generalize phylogenetic trees by allowing the modelization of events of reticulate evolution. Among the different kinds of phylogenetic networks that have been proposed in the literature, the subclass of binary tree-child networks is one of the most studied ones. However, very little is known about the combinatorial structure of these networks. In this paper we address the problem of generating all possible binary tree-child (BTC) networks with a given number of leaves in an efficient way via reduction/augmentation operations that extend and generalize analogous operations for phylogenetic trees, and are biologically relevant. Since our solution is recursive, this also provides us with a recurrence relation giving an upper bound on the number of such networks. We also show how the operations introduced in this paper can be employed to extend the evolutive history of a set of sequences, represented by a BTC network, to include a new sequence. An implementation in python of the algorithms described in this paper, along with some computational experiments, can be downloaded from . [ABSTRACT FROM AUTHOR]
- Published
- 2019
6. Ten simple rules for developing good reading habits during graduate school and beyond.
- Author
Méndez, Marcos
- Subjects
READING, HABIT formation, COMPREHENSION, HISTORY, PERIODICALS, BIBLIOGRAPHICAL citations
- Abstract
The author talks about several rules that a person can follow to develop good reading habits in graduate school and beyond. Topics discussed include the importance of developing the habit of reading on a daily basis; the need to develop comprehension skills; and the need to study the history of one's discipline. Also mentioned are the importance of creating a list of relevant journals, the need to read books, and the benefits of using a reference manager.
- Published
- 2018
7. Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming.
- Author
Wu, Stephen Gang, Wang, Yuxuan, Jiang, Wu, Oyetunde, Tolutola, Yao, Ruilian, Zhang, Xuehong, Shimizu, Kazuyuki, Tang, Yinjie J., and Bao, Forrest Sheng
- Subjects
METABOLIC flux analysis, SUPPORT vector machines, CELL metabolism, MACHINE learning, STOICHIOMETRY
- Abstract
13C metabolic flux analysis (13C-MFA) has been widely used to measure in vivo enzyme reaction rates (i.e., metabolic flux) in microorganisms. Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. In this paper, we present a web-based platform MFlux () that predicts the bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolisms. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the sophisticated relationship between influential factors and metabolic fluxes. We performed a grid search of the best parameter set for each algorithm and verified their performance through 10-fold cross validations. SVM yields the highest accuracy among all three algorithms. Further, we employed quadratic programming to adjust flux profiles to satisfy stoichiometric constraints. Multiple case studies have shown that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate types, growth rate, oxygen conditions, and cultivation methods. Due to the interest of studying model organism under particular carbon sources, bias of fluxome in the dataset may limit the applicability of machine learning models. This problem can be resolved after more papers on 13C-MFA are published for non-model species. [ABSTRACT FROM AUTHOR]
- Published
- 2016
8. A modeling study of budding yeast colony formation and its relationship to budding pattern and aging.
- Author
Wang, Yanli, Lo, Wing-Cheong, and Chou, Ching-Shan
- Subjects
YEAST fungi genetics, BUDDING (Zoology), ELECTRIC properties of cells, HAPLOIDY, DIPLOIDY
- Abstract
Budding yeast, which undergoes polarized growth during budding and mating, has been a useful model system to study cell polarization. Bud sites are selected differently in haploid and diploid yeast cells: haploid cells bud in an axial manner, while diploid cells bud in a bipolar manner. While previous studies have been focused on the molecular details of the bud site selection and polarity establishment, not much is known about how different budding patterns give rise to different functions at the population level. In this paper, we develop a two-dimensional agent-based model to study budding yeast colonies with cell-type specific biological processes, such as budding, mating, mating type switch, consumption of nutrients, and cell death. The model demonstrates that the axial budding pattern enhances mating probability at an early stage and the bipolar budding pattern improves colony development under nutrient limitation. Our results suggest that the frequency of mating type switch might control the trade-off between diploidization and inbreeding. The effect of cellular aging is also studied through our model. Based on the simulations, colonies initiated by an aged haploid cell show declined mating probability at an early stage and recover as the rejuvenated offsprings become the majority. Colonies initiated with aged diploid cells do not show disadvantage in colony expansion possibly due to the fact that young cells contribute the most to colony expansion. [ABSTRACT FROM AUTHOR]
- Published
- 2017
9. Ten simple rules to consider regarding preprint submission.
- Author
Bourne, Philip E., Polka, Jessica K., Vale, Ronald D., and Kiley, Robert
- Subjects
PREPRINTS, DATA mining, LICENSES, COPYRIGHT
- Abstract
The article discusses rules to consider regarding preprint submission of scientific work. Topics include an analysis of published papers by cell biologist Stephen Royle estimating the average time from first submission to publication, data mining of written content to make better use of the knowledge it contains, and encouraging authors to choose licenses and formats that facilitate reuse while retaining copyright to their work.
- Published
- 2017
10. Personalized glucose forecasting for type 2 diabetes using data assimilation.
- Author
Albers, David J., Levine, Matthew, Gluckman, Bruce, Ginsberg, Henry, Hripcsak, George, and Mamykina, Lena
- Subjects
BLOOD sugar monitoring, TYPE 2 diabetes, QUALITY of life, GLYCEMIC control, BAYESIAN analysis, GAUSSIAN processes
- Abstract
Type 2 diabetes leads to premature death and reduced quality of life for 8% of Americans. Nutrition management is critical to maintaining glycemic control, yet it is difficult to achieve due to the high individual differences in glycemic response to nutrition. Anticipating glycemic impact of different meals can be challenging not only for individuals with diabetes, but also for expert diabetes educators. Personalized computational models that can accurately forecast an impact of a given meal on an individual’s blood glucose levels can serve as the engine for a new generation of decision support tools for individuals with diabetes. However, to be useful in practice, these computational engines need to generate accurate forecasts based on limited datasets consistent with typical self-monitoring practices of individuals with type 2 diabetes. This paper uses three forecasting machines: (i) data assimilation, a technique borrowed from atmospheric physics and engineering that uses Bayesian modeling to infuse data with human knowledge represented in a mechanistic model, to generate real-time, personalized, adaptable glucose forecasts; (ii) model averaging of data assimilation output; and (iii) dynamical Gaussian process model regression. The proposed data assimilation machine, the primary focus of the paper, uses a modified dual unscented Kalman filter to estimate states and parameters, personalizing the mechanistic models. Model selection is used to make a personalized model selection for the individual and their measurement characteristics. The data assimilation forecasts are empirically evaluated against actual postprandial glucose measurements captured by individuals with type 2 diabetes, and against predictions generated by experienced diabetes educators after reviewing a set of historical nutritional records and glucose measurements for the same individual. 
The evaluation suggests that the data assimilation forecasts compare well with specific glucose measurements and match or exceed in accuracy expert forecasts. We conclude by examining ways to present predictions as forecast-derived range quantities and evaluate the comparative advantages of these ranges. [ABSTRACT FROM AUTHOR]
- Published
- 2017
11. A quick guide for using Microsoft OneNote as an electronic laboratory notebook.
- Author
Guerrero, Santiago, López-Cortés, Andrés, García-Cárdenas, Jennyfer M., Saa, Pablo, Indacochea, Alberto, Armendáriz-Castillo, Isaac, Zambrano, Ana Karina, Yumiceba, Verónica, Pérez-Villa, Andy, Guevara-Ramírez, Patricia, Moscoso-Zea, Oswaldo, Paredes, Joel, Leone, Paola E., and Paz-y-Miño, César
- Subjects
DATA recorders & recording, MEDICAL research, CLINICAL trials, WORKFLOW, RESEARCH institutes
- Abstract
Scientific data recording and reporting systems are of a great interest for endorsing reproducibility and transparency practices among the scientific community. Current research generates large datasets that can no longer be documented using paper lab notebooks (PLNs). In this regard, electronic laboratory notebooks (ELNs) could be a promising solution to replace PLNs and promote scientific reproducibility and transparency. We previously analyzed five ELNs and performed two survey-based studies to implement an ELN in a biomedical research institute. Among the ELNs tested, we found that Microsoft OneNote presents numerous features related to ELN best functionalities. In addition, both surveyed groups preferred OneNote over a scientifically designed ELN (PerkinElmer Elements). However, OneNote remains a general note-taking application and has not been designed for scientific purposes. We therefore provide a quick guide to adapt OneNote to an ELN workflow that can also be adjusted to other nonscientific ELNs. [ABSTRACT FROM AUTHOR]
- Published
- 2019
12. Even a good influenza forecasting model can benefit from internet-based nowcasts, but those benefits are limited.
- Author
Osthus, Dave, Daughton, Ashlynn R., and Priedhorsky, Reid
- Subjects
INFLUENZA, RESPIRATORY infections, PUBLIC health, MATHEMATICAL models of forecasting
- Abstract
The ability to produce timely and accurate flu forecasts in the United States can significantly impact public health. Augmenting forecasts with internet data has shown promise for improving forecast accuracy and timeliness in controlled settings, but results in practice are less convincing, as models augmented with internet data have not consistently outperformed models without internet data. In this paper, we perform a controlled experiment, taking into account data backfill, to improve clarity on the benefits and limitations of augmenting an already good flu forecasting model with internet-based nowcasts. Our results show that a good flu forecasting model can benefit from the augmentation of internet-based nowcasts in practice for all considered public health-relevant forecasting targets. The degree of forecast improvement due to nowcasting, however, is uneven across forecasting targets, with short-term forecasting targets seeing the largest improvements and seasonal targets such as the peak timing and intensity seeing relatively marginal improvements. The uneven forecasting improvements across targets hold even when “perfect” nowcasts are used. These findings suggest that further improvements to flu forecasting, particularly seasonal targets, will need to derive from other, non-nowcasting approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2019
13. Ten Simple Rules for a Bioinformatics Journal Club.
- Author
Lonsdale, Andrew, Sietsma Penington, Jocelyn, Rice, Timothy, Walker, Michael, and Dashnow, Harriet
- Subjects
BIOINFORMATICS, INFORMATION science, COMPUTATIONAL biology, CLUBS, SCIENTIFIC literature, SOCIETIES
- Abstract
The article outlines the rules for a bioinformatics journal club which, according to the authors, is a great way to take in the scientific literature, keep up with developments in their field, and hone their communication and analytical skills. The rules include holding a journal club at eight in the morning, finding good articles for discussion, and expanding the roster of leaders as people join the journal club.
- Published
- 2016
14. Enzyme sequestration by the substrate: An analysis in the deterministic and stochastic domains.
- Author
Petrides, Andreas and Vinnicombe, Glenn
- Subjects
PHOSPHORYLATION, PHOSPHATASES, KINASES, ENZYMES, SEQUESTRATION (Chemistry)
- Abstract
This paper is concerned with the potential multistability of protein concentrations in the cell. That is, situations where one, or a family of, proteins may sit at one of two or more different steady state concentrations in otherwise identical cells, and in spite of them being in the same environment. For models of multisite protein phosphorylation for example, in the presence of excess substrate, it has been shown that the achievable number of stable steady states can increase linearly with the number of phosphosites available. In this paper, we analyse the consequences of adding enzyme docking to these and similar models, with the resultant sequestration of phosphatase and kinase by the fully unphosphorylated and by the fully phosphorylated substrates respectively. In the large molecule numbers limit, where deterministic analysis is applicable, we prove that there are always values for these rates of sequestration which, when exceeded, limit the extent of multistability. For the models considered here, these numbers are much smaller than the affinity of the enzymes to the substrate when it is in a modifiable state. As substrate enzyme-sequestration is increased, we further prove that the number of steady states will inevitably be reduced to one. For smaller molecule numbers a stochastic analysis is more appropriate, where multistability in the large molecule numbers limit can manifest itself as multimodality of the probability distribution; the system spending periods of time in the vicinity of one mode before jumping to another. Here, we find that substrate enzyme sequestration can induce bimodality even in systems where only a single steady state can exist at large numbers. 
To facilitate this analysis, we develop a weakly chained diagonally dominant M-matrix formulation of the Chemical Master Equation, allowing greater insights in the way particular mechanisms, like enzyme sequestration, can shape probability distributions and therefore exhibit different behaviour across different regimes. [ABSTRACT FROM AUTHOR]
- Published
- 2018
15. Forecasting Human African Trypanosomiasis Prevalences from Population Screening Data Using Continuous Time Models.
- Author
De Vries, Harwin, Wagelmans, Albert P. M., Hasker, Epco, Lumbala, Crispin, Lutumba, Pascal, De Vlas, Sake J., and Klundert, Joris Van De
- Subjects
AFRICAN trypanosomiasis, MEDICAL screening, DISEASE prevalence, DISEASE progression, EPIDEMICS, DIAGNOSIS
- Abstract
To eliminate and eradicate gambiense human African trypanosomiasis (HAT), maximizing the effectiveness of active case finding is of key importance. The progression of the epidemic is largely influenced by the planning of these operations. This paper introduces and analyzes five models for predicting HAT prevalence in a given village based on past observed prevalence levels and past screening activities in that village. Based on the quality of prevalence level predictions in 143 villages in Kwamouth (DRC), and based on the theoretical foundation underlying the models, we consider variants of the Logistic Model—a model inspired by the SIS epidemic model—to be most suitable for predicting HAT prevalence levels. Furthermore, we demonstrate the applicability of this model to predict the effects of planning policies for screening operations. Our analysis yields an analytical expression for the screening frequency required to reach eradication (zero prevalence) and a simple approach for determining the frequency required to reach elimination within a given time frame (one case per 10000). Furthermore, the model predictions suggest that annual screening is only expected to lead to eradication if at least half of the cases are detected during the screening rounds. This paper extends knowledge on control strategies for HAT and serves as a basis for further modeling and optimization studies. [ABSTRACT FROM AUTHOR]
- Published
- 2016
16. LOTUS: A single- and multitask machine learning algorithm for the prediction of cancer driver genes.
- Author
Collier, Olivier, Stoven, Véronique, and Vert, Jean-Philippe
- Subjects
CANCER genes, MACHINE learning, LEARNING strategies, P53 antioncogene, PROTEIN-protein interactions, COMPUTATIONAL biology, TUMOR suppressor genes
- Abstract
Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help finding new therapeutic targets or biomarkers. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper, we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach which allows to integrate various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms five other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types. [ABSTRACT FROM AUTHOR]
- Published
- 2019
17. Optimizing spatial allocation of seasonal influenza vaccine under temporal constraints.
- Author
Venkatramanan, Srinivasan, Chen, Jiangzhuo, Fadikar, Arindam, Gupta, Sandeep, Higdon, Dave, Lewis, Bryan, Marathe, Madhav, Mortveit, Henning, and Vullikanti, Anil
- Subjects
SEASONAL influenza, INFLUENZA vaccines, FLU vaccine efficacy, HEALTH policy
- Abstract
Prophylactic interventions such as vaccine allocation are some of the most effective public health policy planning tools. The supply of vaccines, however, is limited and an important challenge is to optimally allocate the vaccines to minimize epidemic impact. This resource allocation question (which we refer to as VID) has multiple dimensions: when, where, to whom, etc. Most of the existing literature in this topic deals with the latter (to whom), proposing policies that prioritize individuals by age and disease risk. However, since seasonal influenza spread has a typical spatial trend, and due to the temporal constraints enforced by the availability schedule, the when and where problems become equally, if not more, relevant. In this paper, we study the VID problem in the context of seasonal influenza spread in the United States. We develop a national scale metapopulation model for influenza that integrates both short and long distance human mobility, along with realistic data on vaccine uptake. We also design GA, a greedy algorithm for allocating the vaccine supply at the state level under temporal constraints and show that such a strategy improves over the current baseline of pro-rata allocation, and the improvement is more pronounced for higher vaccine efficacy and moderate flu season intensity. Further, the resulting strategy resembles a ring vaccination applied spatially across the US. [ABSTRACT FROM AUTHOR]
- Published
- 2019
18. Weak coupling between intracellular feedback loops explains dissociation of clock gene dynamics.
- Author
Schmal, Christoph, Ono, Daisuke, Myung, Jihwan, Pett, J. Patrick, Honma, Sato, Honma, Ken-Ichi, Herzel, Hanspeter, and Tokuda, Isao T.
- Subjects
MOLECULAR clock, CIRCADIAN rhythms, GENE expression, PHYSICAL sciences, CYTOLOGY
- Abstract
Circadian rhythms are generated by interlocked transcriptional-translational negative feedback loops (TTFLs), the molecular process implemented within a cell. The contributions, weighting and balancing between the multiple feedback loops remain debated. Dissociated, free-running dynamics in the expression of distinct clock genes has been described in recent experimental studies that applied various perturbations such as slice preparations, light pulses, jet-lag, and culture medium exchange. In this paper, we provide evidence that this “presumably transient” dissociation of circadian gene expression oscillations may occur at the single-cell level. Conceptual and detailed mechanistic mathematical modeling suggests that such dissociation is due to a weak interaction between multiple feedback loops present within a single cell. The dissociable loops provide insights into underlying mechanisms and general design principles of the molecular circadian clock. [ABSTRACT FROM AUTHOR]
- Published
- 2019
19. Accuracy of Answers to Cell Lineage Questions Depends on Single-Cell Genomics Data Quality and Quantity.
- Author
Spiro, Adam and Shapiro, Ehud
- Subjects
CELL lines, GENOMICS, LINEAGE, PHYLOGENY, ALLELES
- Abstract
Advances in single-cell (SC) genomics enable commensurate improvements in methods for uncovering lineage relations among individual cells, as determined by phylogenetic analysis of the somatic mutations harbored by each cell. Theoretically, complete and accurate knowledge of the genome of each cell of an individual can produce an extremely accurate cell lineage tree of that individual. However, the reality of SC genomics is that such complete and accurate knowledge would be wanting, in quality and in quantity, for the foreseeable future. In this paper we offer a framework for systematically exploring the feasibility of answering cell lineage questions based on SC somatic mutational analysis, as a function of SC genomics data quality and quantity. We take into consideration the current limitations of SC genomics in terms of mutation data quality, most notably amplification bias and allele dropouts (ADO), as well as cost, which puts practical limits on mutation data quantity obtained from each cell as well as on cell sample density. We do so by generating in silico cell lineage trees using a dedicated formal language, eSTG, and show how the ability to answer correctly a cell lineage question depends on the quality and quantity of the SC mutation data. The presented framework can serve as a baseline for the potential of current SC genomics to unravel cell lineage dynamics, as well as the potential contributions of future advancement, both biochemical and computational, for the task. [ABSTRACT FROM AUTHOR]
- Published
- 2016
20. Fast Bayesian Inference of Copy Number Variants using Hidden Markov Models with Wavelet Compression.
- Author
Wiedenhoeft, John, Brugel, Eric, and Schliep, Alexander
- Subjects
MARKOV processes, WAVELETS (Mathematics), FORWARD-backward algorithm, CHROMOSOME fragments, BAYESIAN analysis
- Abstract
By integrating Haar wavelets with Hidden Markov Models, we achieve drastically reduced running times for Bayesian inference using Forward-Backward Gibbs sampling. We show that this improves detection of genomic copy number variants (CNV) in array CGH experiments compared to the state-of-the-art, including standard Gibbs sampling. The method concentrates computational effort on chromosomal segments which are difficult to call, by dynamically and adaptively recomputing consecutive blocks of observations likely to share a copy number. This makes routine diagnostic use and re-analysis of legacy data collections feasible; to this end, we also propose an effective automatic prior. An open source software implementation of our method is available at (DOI: ). This paper was selected for oral presentation at RECOMB 2016, and an abstract is published in the conference proceedings. [ABSTRACT FROM AUTHOR]
- Published
- 2016
21. Identifying nonlinear dynamical systems via generative recurrent neural networks with applications to fMRI.
- Author
Koppe, Georgia, Toutounji, Hazem, Kirsch, Peter, Lis, Stefanie, and Durstewitz, Daniel
- Subjects
RECURRENT neural networks, NONLINEAR dynamical systems, LINEAR dynamical systems, FUNCTIONAL magnetic resonance imaging, DYNAMICAL systems
- Abstract
A major tenet in theoretical neuroscience is that cognitive and behavioral processes are ultimately implemented in terms of the neural system dynamics. Accordingly, a major aim for the analysis of neurophysiological measurements should lie in the identification of the computational dynamics underlying task processing. Here we advance a state space model (SSM) based on generative piecewise-linear recurrent neural networks (PLRNN) to assess dynamics from neuroimaging data. In contrast to many other nonlinear time series models which have been proposed for reconstructing latent dynamics, our model is easily interpretable in neural terms, amenable to systematic dynamical systems analysis of the resulting set of equations, and can straightforwardly be transformed into an equivalent continuous-time dynamical system. The major contributions of this paper are the introduction of a new observation model suitable for functional magnetic resonance imaging (fMRI) coupled to the latent PLRNN, an efficient stepwise training procedure that forces the latent model to capture the ‘true’ underlying dynamics rather than just fitting (or predicting) the observations, and of an empirical measure based on the Kullback-Leibler divergence to evaluate from empirical time series how well this goal of approximating the underlying dynamics has been achieved. We validate and illustrate the power of our approach on simulated ‘ground-truth’ dynamical systems as well as on experimental fMRI time series, and demonstrate that the learnt dynamics harbors task-related nonlinear structure that a linear dynamical model fails to capture. Given that fMRI is one of the most common techniques for measuring brain activity non-invasively in human subjects, this approach may provide a novel step toward analyzing aberrant (nonlinear) dynamics for clinical assessment or neuroscientific research. [ABSTRACT FROM AUTHOR]
- Published
- 2019
22. Transient crosslinking kinetics optimize gene cluster interactions.
- Author
Walker, Benjamin, Taylor, Dane, Lawrimore, Josh, Hult, Caitlin, Adalsteinsson, David, Bloom, Kerry, and Forest, M. Gregory
- Subjects
GENE clusters, CHROMOSOME structure, COMPUTATIONAL biology, RIBOSOMAL DNA
- Abstract
Our understanding of how chromosomes structurally organize and dynamically interact has been revolutionized through the lens of long-chain polymer physics. Major protein contributors to chromosome structure and dynamics are condensin and cohesin that stochastically generate loops within and between chains, and entrap proximal strands of sister chromatids. In this paper, we explore the ability of transient, protein-mediated, gene-gene crosslinks to induce clusters of genes, thereby dynamic architecture, within the highly repeated ribosomal DNA that comprises the nucleolus of budding yeast. We implement three approaches: live cell microscopy; computational modeling of the full genome during G1 in budding yeast, exploring four decades of timescales for transient crosslinks between 5kbp domains (genes) in the nucleolus on Chromosome XII; and, temporal network models with automated community (cluster) detection algorithms applied to the full range of 4D modeling datasets. The data analysis tools detect and track gene clusters, their size, number, persistence time, and their plasticity (deformation). Of biological significance, our analysis reveals an optimal mean crosslink lifetime that promotes pairwise and cluster gene interactions through “flexible” clustering. In this state, large gene clusters self-assemble yet frequently interact (merge and separate), marked by gene exchanges between clusters, which in turn maximizes global gene interactions in the nucleolus. This regime stands between two limiting cases each with far less global gene interactions: with shorter crosslink lifetimes, “rigid” clustering emerges with clusters that interact infrequently; with longer crosslink lifetimes, there is a dissolution of clusters. These observations are compared with imaging experiments on a normal yeast strain and two condensin-modified mutant cell strains. 
We apply the same image analysis pipeline to the experimental and simulated datasets, providing support for the modeling predictions. [ABSTRACT FROM AUTHOR]
- Published
- 2019
23. Per-sample immunoglobulin germline inference from B cell receptor deep sequencing data.
- Author
-
Ralph, Duncan K. and Matsen IV, Frederick A.
- Subjects
B cell receptors ,IMMUNOGLOBULIN genes ,B cells ,ALLELES - Abstract
The collection of immunoglobulin genes in an individual’s germline, which gives rise to B cell receptors via recombination, is known to vary significantly across individuals. In humans, for example, each individual has only a fraction of the several hundred known V alleles. Furthermore, the currently-accepted set of known V alleles is both incomplete (particularly for non-European samples), and contains a significant number of spurious alleles. The resulting uncertainty as to which immunoglobulin alleles are present in any given sample results in inaccurate B cell receptor sequence annotations, and in particular inaccurate inferred naive ancestors. In this paper we first show that the currently widespread practice of aligning each sequence to its closest match in the full set of IMGT alleles results in a very large number of spurious alleles that are not in the sample’s true set of germline V alleles. We then describe a new method for inferring each individual’s germline gene set from deep sequencing data, and show that it improves upon existing methods by making a detailed comparison on a variety of simulated and real data samples. This new method has been integrated into the partis annotation and clonal family inference package, available at , and is run by default without affecting overall run time. [ABSTRACT FROM AUTHOR]
- Published
- 2019
24. Ensemble of decision tree reveals potential miRNA-disease associations.
- Author
-
Chen, Xing, Zhu, Chi-Chi, and Yin, Jun
- Subjects
DIMENSION reduction (Statistics) ,DECISION trees ,RENAL cancer ,THERAPEUTICS ,BREAST tumors ,MICRORNA - Abstract
In recent years, increasing associations between microRNAs (miRNAs) and human diseases have been identified. Based on accumulating biological data, many computational models for potential miRNA-disease associations inference have been developed, which saves time and expenditure on experimental studies, making great contributions to researching molecular mechanism of human diseases and developing new drugs for disease treatment. In this paper, we proposed a novel computational method named Ensemble of Decision Tree based MiRNA-Disease Association prediction (EDTMDA), which innovatively built a computational framework integrating ensemble learning and dimensionality reduction. For each miRNA-disease pair, the feature vector was extracted by calculating the statistical measures, graph theoretical measures, and matrix factorization results for the miRNA and disease, respectively. Then multiple base learnings were built to yield many decision trees (DTs) based on random selection of negative samples and miRNA/disease features. Particularly, Principal Components Analysis was applied to each base learning to reduce feature dimensionality and hence remove the noise or redundancy. Average strategy was adopted for these DTs to get final association scores between miRNAs and diseases. In model performance evaluation, EDTMDA showed AUC of 0.9309 in global leave-one-out cross validation (LOOCV) and AUC of 0.8524 in local LOOCV. Additionally, AUC of 0.9192+/-0.0009 in 5-fold cross validation proved the model’s reliability and stability. Furthermore, three types of case studies for four human diseases were implemented. As a result, 94% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 96% (Breast Neoplasms) and 88% (Carcinoma Hepatocellular) of top 50 predicted miRNAs were confirmed by experimental evidences in literature. [ABSTRACT FROM AUTHOR]
- Published
- 2019
25. Large vessels as a tree of transmission lines incorporated in the CircAdapt whole-heart model: A computational tool to examine heart-vessel interaction.
- Author
-
Heusinkveld, Maarten H. G., Huberts, Wouter, Lumens, Joost, Arts, Theo, Delhaas, Tammo, and Reesink, Koen D.
- Subjects
ELECTRIC lines ,TIMBERLINE ,SYSTOLIC blood pressure ,BLOOD pressure ,CAROTID artery ,HEMODYNAMICS - Abstract
We developed a whole-circulation computational model by integrating a transmission line (TL) model describing vascular wave transmission into the established CircAdapt platform of whole-heart mechanics. In the present paper, we verify the numerical framework of our TL model by benchmark comparison to a previously validated pulse wave propagation (PWP) model. Additionally, we showcase the integrated CircAdapt–TL model, which now includes the heart as well as extensive arterial and venous trees with terminal impedances. We present CircAdapt–TL haemodynamics simulations of: 1) a systemic normotensive situation and 2) a systemic hypertensive situation. In the TL–PWP benchmark comparison we found good agreement regarding pressure and flow waveforms (relative errors ≤ 2.9% for pressure, and ≤ 5.6% for flow). CircAdapt–TL simulations reproduced the typically observed haemodynamic changes with hypertension, expressed by increases in mean and pulsatile blood pressures, and increased arterial pulse wave velocity. We observed a change in the timing of pressure augmentation (defined as a late-systolic boost in aortic pressure) from occurring after time of peak systolic pressure in the normotensive situation, to occurring prior to time of peak pressure in the hypertensive situation. The pressure augmentation could not be observed when the systemic circulation was lumped into a (non-linear) three-element windkessel model, instead of using our TL model. Wave intensity analysis at the carotid artery indicated earlier arrival of reflected waves with hypertension as compared to normotension, in good qualitative agreement with findings in patients. In conclusion, we successfully embedded a TL model as a vascular module into the CircAdapt platform. The integrated CircAdapt–TL model allows detailed studies on mechanistic studies on heart-vessel interaction. [ABSTRACT FROM AUTHOR]
- Published
- 2019
26. PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting.
- Author
-
Erkes, Annett, Mücke, Stefanie, Reschke, Maik, Boch, Jens, and Grau, Jan
- Subjects
TANDEM repeats ,PLANT genes ,NUCLEOTIDE sequence ,COMPUTATIONAL biology ,GENE targeting ,FORKHEAD transcription factors - Abstract
Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues, bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths, and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated over the last years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement of prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that no target genes are predicted by any prediction tool for several TALEs, which we term orphan TALEs for this reason. We postulate that one explanation for orphan TALEs are incomplete gene annotations and, hence, propose to replace promoterome-wide by genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independent of existing gene annotations. [ABSTRACT FROM AUTHOR]
- Published
- 2019
27. A Bayesian framework for the analysis of systems biology models of the brain.
- Author
-
Russell-Buckland, Joshua, Barnes, Christopher P., and Tachtsidis, Ilias
- Subjects
BAYESIAN analysis ,BRAIN physiology ,SYSTEMS biology ,SENSITIVITY analysis ,MODELS & modelmaking - Abstract
Systems biology models are used to understand complex biological and physiological systems. Interpretation of these models is an important part of developing this understanding. These models are often fit to experimental data in order to understand how the system has produced various phenomena or behaviour that are seen in the data. In this paper, we have outlined a framework that can be used to perform Bayesian analysis of complex systems biology models. In particular, we have focussed on analysing a systems biology of the brain using both simulated and measured data. By using a combination of sensitivity analysis and approximate Bayesian computation, we have shown that it is possible to obtain distributions of parameters that can better guard against misinterpretation of results, as compared to a maximum likelihood estimate based approach. This is done through analysis of simulated and experimental data. NIRS measurements were simulated using the same simulated systemic input data for the model in a ‘healthy’ and ‘impaired’ state. By analysing both of these datasets, we show that different parameter spaces can be distinguished and compared between different physiological states or conditions. Finally, we analyse experimental data using the new Bayesian framework and the previous maximum likelihood estimate approach, showing that the Bayesian approach provides a more complete understanding of the parameter space. [ABSTRACT FROM AUTHOR]
- Published
- 2019
28. Chemical features mining provides new descriptive structure-odor relationships.
- Author
-
Licon, Carmen C., Bosc, Guillaume, Sabri, Mohammed, Mantel, Marylou, Fournel, Arnaud, Bushdid, Caroline, Golebiowski, Jerome, Robardet, Celine, Plantevit, Marc, Kaytoue, Mehdi, and Bensafi, Moustafa
- Subjects
ODORS ,COLOR vision ,PREDICTION models ,BIOLOGY ,ALGORITHMS - Abstract
An important goal in researching the biology of olfaction is to link the perception of smells to the chemistry of odorants. In other words, why do some odorants smell like fruits and others like flowers? While the so-called stimulus-percept issue was resolved in the field of color vision some time ago, the relationship between the chemistry and psycho-biology of odors remains unclear up to the present day. Although a series of investigations have demonstrated that this relationship exists, the descriptive and explicative aspects of the proposed models that are currently in use require greater sophistication. One reason for this is that the algorithms of current models do not consistently consider the possibility that multiple chemical rules can describe a single quality despite the fact that this is the case in reality, whereby two very different molecules can evoke a similar odor. Moreover, the available datasets are often large and heterogeneous, thus rendering the generation of multiple rules without any use of a computational approach overly complex. We considered these two issues in the present paper. First, we built a new database containing 1689 odorants characterized by physicochemical properties and olfactory qualities. Second, we developed a computational method based on a subgroup discovery algorithm that discriminated perceptual qualities of smells on the basis of physicochemical properties. Third, we ran a series of experiments on 74 distinct olfactory qualities and showed that the generation and validation of rules linking chemistry to odor perception was possible. Taken together, our findings provide significant new insights into the relationship between stimulus and percept in olfaction. 
In addition, by automatically extracting new knowledge linking chemistry of odorants and psychology of smells, our results provide a new computational framework of analysis enabling scientists in the field to test original hypotheses using descriptive or predictive modeling. [ABSTRACT FROM AUTHOR]
- Published
- 2019
29. Model diagnostics and refinement for phylodynamic models.
- Author
-
Lau, Max SY, Grenfell, Bryan T, Worby, Colin J, and Gibson, Gavin J
- Subjects
EPIDEMIOLOGY ,GENOMICS ,PATHOGENIC microorganisms ,BIOLOGICAL evolution ,LIFE sciences ,SUPERSPREADING events - Abstract
Phylodynamic modelling, which studies the joint dynamics of epidemiological and evolutionary processes, has made significant progress in recent years due to increasingly available genomic data and advances in statistical modelling. These advances have greatly improved our understanding of transmission dynamics of many important pathogens. Nevertheless, there remains a lack of effective, targetted diagnostic tools for systematically detecting model mis-specification. Development of such tools is essential for model criticism, refinement, and calibration. The idea of utilising latent residuals for model assessment has already been exploited in general spatio-temporal epidemiological settings. Specifically, by proposing appropriately designed non-centered, re-parameterizations of a given epidemiological process, one can construct latent residuals with known sampling distributions which can be used to quantify evidence of model mis-specification. In this paper, we extend this idea to formulate a novel model-diagnostic framework for phylodynamic models. Using simulated examples, we show that our framework may effectively detect a particular form of mis-specification in a phylodynamic model, particularly in the event of superspreading. We also exemplify our approach by applying the framework to a dataset describing a local foot-and-mouth (FMD) outbreak in the UK, eliciting strong evidence against the assumption of no within-host-diversity in the outbreak. We further demonstrate that our framework can facilitate model calibration in real-life scenarios, by proposing a within-host-diversity model which appears to offer a better fit to data than one that assumes no within-host-diversity of FMD virus. [ABSTRACT FROM AUTHOR]
- Published
- 2019
30. LMTRDA: Using logistic model tree to predict MiRNA-disease associations by fusing multi-source information of sequences and similarities.
- Author
-
Wang, Lei, You, Zhu-Hong, Chen, Xing, Li, Yang-Ming, Dong, Ya-Nan, Li, Li-Ping, and Zheng, Kai
- Subjects
LOGISTIC model (Demography) ,MICRORNA ,MEDICAL genetics ,RNA sequencing ,PREDICTION models ,BREAST tumors ,NATURAL language processing ,LYMPHOMA diagnosis - Abstract
Emerging evidence has shown microRNAs (miRNAs) play an important role in human disease research. Identifying potential association among them is significant for the development of pathology, diagnose and therapy. However, only a tiny portion of all miRNA-disease pairs in the current datasets are experimentally validated. This prompts the development of high-precision computational methods to predict real interaction pairs. In this paper, we propose a new model of Logistic Model Tree for predicting miRNA-Disease Association (LMTRDA) by fusing multi-source information including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In particular, we introduce miRNA sequence information and extract its features using natural language processing technique for the first time in the miRNA-disease prediction model. In the cross-validation experiment, LMTRDA obtained 90.51% prediction accuracy with 92.55% sensitivity at the AUC of 90.54% on the HMDD V3.0 dataset. To further evaluate the performance of LMTRDA, we compared it with different classifier and feature descriptor models. In addition, we also validate the predictive ability of LMTRDA in human diseases including Breast Neoplasms, Breast Neoplasms and Lymphoma. As a result, 28, 27 and 26 out of the top 30 miRNAs associated with these diseases were verified by experiments in different kinds of case studies. These experimental results demonstrate that LMTRDA is a reliable model for predicting the association among miRNAs and diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2019
31. Predicting the mechanism and rate of H-NS binding to AT-rich DNA.
- Author
-
Riccardi, Enrico, van Mastbergen, Eva C., Navarre, William Wiley, and Vreede, Jocelyne
- Subjects
BACTERIA ,ARGININE ,DNA ,BIOCHEMISTRY ,PROTEINS - Abstract
Bacteria contain several nucleoid-associated proteins that organize their genomic DNA into the nucleoid by bending, wrapping or bridging DNA. The Histone-like Nucleoid Structuring protein H-NS found in many Gram-negative bacteria is a DNA bridging protein and can structure DNA by binding to two separate DNA duplexes or to adjacent sites on the same duplex, depending on external conditions. Several nucleotide sequences have been identified to which H-NS binds with high affinity, indicating H-NS prefers AT-rich DNA. To date, highly detailed structural information of the H-NS DNA complex remains elusive. Molecular simulation can complement experiments by modelling structures and their time evolution in atomistic detail. In this paper we report an exploration of the different binding modes of H-NS to a high affinity nucleotide sequence and an estimate of the associated rate constant. By means of molecular dynamics simulations, we identified three types of binding for H-NS to AT-rich DNA. To further sample the transitions between these binding modes, we performed Replica Exchange Transition Interface Sampling, providing predictions of the mechanism and rate constant of H-NS binding to DNA. H-NS interacts with the DNA through a conserved QGR motif, aided by a conserved arginine at position 93. The QGR motif interacts first with phosphate groups, followed by the formation of hydrogen bonds between acceptors in the DNA minor groove and the sidechains of either Q112 or R114. After R114 inserts into the minor groove, the rest of the QGR motif follows. Full insertion of the QGR motif in the minor groove is stable over several tens of nanoseconds, and involves hydrogen bonds between the bases and both backbone and sidechains of the QGR motif. The rate constant for the process of H-NS binding to AT-rich DNA resulting in full insertion of the QGR motif is in the order of 10⁶ M⁻¹ s⁻¹, which is rate limiting compared to the non-specific association of H-NS to the DNA backbone at a rate of 10⁸ M⁻¹ s⁻¹. [ABSTRACT FROM AUTHOR]
- Published
- 2019
32. A data-driven interactome of synergistic genes improves network-based cancer outcome prediction.
- Author
-
Allahyar, Amin, Ubels, Joske, and de Ridder, Jeroen
- Subjects
CANCER patients ,GENE expression ,CANCER treatment ,HEALTH outcome assessment ,MOLECULAR genetics - Abstract
Robustly predicting outcome for cancer patients from gene expression is an important challenge on the road to better personalized treatment. Network-based outcome predictors (NOPs), which considers the cellular wiring diagram in the classification, hold much promise to improve performance, stability and interpretability of identified marker genes. Problematically, reports on the efficacy of NOPs are conflicting and for instance suggest that utilizing random networks performs on par to networks that describe biologically relevant interactions. In this paper we turn the prediction problem around: instead of using a given biological network in the NOP, we aim to identify the network of genes that truly improves outcome prediction. To this end, we propose SyNet, a gene network constructed ab initio from synergistic gene pairs derived from survival-labelled gene expression data. To obtain SyNet, we evaluate synergy for all 69 million pairwise combinations of genes resulting in a network that is specific to the dataset and phenotype under study and can be used to in a NOP model. We evaluated SyNet and 11 other networks on a compendium dataset of >4000 survival-labelled breast cancer samples. For this purpose, we used cross-study validation which more closely emulates real world application of these outcome predictors. We find that SyNet is the only network that truly improves performance, stability and interpretability in several existing NOPs. We show that SyNet overlaps significantly with existing gene networks, and can be confidently predicted (~85% AUC) from graph-topological descriptions of these networks, in particular the breast tissue-specific network. Due to its data-driven nature, SyNet is not biased to well-studied genes and thus facilitates post-hoc interpretation. We find that SyNet is highly enriched for known breast cancer genes and genes related to e.g. histological grade and tamoxifen resistance, suggestive of a role in determining breast cancer outcome. 
[ABSTRACT FROM AUTHOR]
- Published
- 2019
33. Thermodynamic model of gene regulation for the Or59b olfactory receptor in Drosophila.
- Author
-
González, Alejandra, Jafari, Shadi, Zenere, Alberto, Alenius, Mattias, and Altafini, Claudio
- Subjects
OLFACTORY receptors ,GENETIC regulation ,DROSOPHILA ,EUKARYOTES ,TRANSCRIPTION factors ,THERMODYNAMICS - Abstract
Complex eukaryotic promoters normally contain multiple cis-regulatory sequences for different transcription factors (TFs). The binding patterns of the TFs to these sites, as well as the way the TFs interact with each other and with the RNA polymerase (RNAp), lead to combinatorial problems rarely understood in detail, especially under varying epigenetic conditions. The aim of this paper is to build a model describing how the main regulatory cluster of the olfactory receptor Or59b drives transcription of this gene in Drosophila. The cluster-driven expression of this gene is represented as the equilibrium probability of RNAp being bound to the promoter region, using a statistical thermodynamic approach. The RNAp equilibrium probability is computed in terms of the occupancy probabilities of the single TFs of the cluster to the corresponding binding sites, and of the interaction rules among TFs and RNAp, using experimental data of Or59b expression to tune the model parameters. The model reproduces correctly the changes in RNAp binding probability induced by various mutation of specific sites and epigenetic modifications. Some of its predictions have also been validated in novel experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2019
34. Global analysis of N6-methyladenosine functions and its disease association using deep learning and network-based methods.
- Author
-
Zhang, Song-yao, Zhang, Shao-wu, Fan, Xiao-nan, Meng, Jia, Chen, Yidong, Gao, Shou-Jiang, and Huang, Yufei
- Subjects
PHYSIOLOGICAL effects of adenosine ,DEEP learning ,MESSENGER RNA ,PROTEIN-protein interactions ,CELL proliferation - Abstract
N6-methyladenosine (m6A) is the most abundant methylation, existing in >25% of human mRNAs. Exciting recent discoveries indicate the close involvement of m6A in regulating many different aspects of mRNA metabolism and diseases like cancer. However, our current knowledge about how m6A levels are controlled and whether and how regulation of m6A levels of a specific gene can play a role in cancer and other diseases is mostly elusive. We propose in this paper a computational scheme for predicting m6A-regulated genes and m6A-associated disease, which includes Deep-m6A, the first model for detecting condition-specific m6A sites from MeRIP-Seq data with a single base resolution using deep learning and Hot-m6A, a new network-based pipeline that prioritizes functional significant m6A genes and its associated diseases using the Protein-Protein Interaction (PPI) and gene-disease heterogeneous networks. We applied Deep-m6A and this pipeline to 75 MeRIP-seq human samples, which produced a compact set of 709 functionally significant m6A-regulated genes and nine functionally enriched subnetworks. The functional enrichment analysis of these genes and networks reveal that m6A targets key genes of many critical biological processes including transcription, cell organization and transport, and cell proliferation and cancer-related pathways such as Wnt pathway. The m6A-associated disease analysis prioritized five significantly associated diseases including leukemia and renal cell carcinoma. These results demonstrate the power of our proposed computational scheme and provide new leads for understanding m6A regulatory functions and its roles in diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2019
35. Ten quick tips for sharing open genomic data.
- Author
-
Brown, Anne V., Campbell, Jacqueline D., Assefa, Teshale, Grant, David, Nelson, Rex T., Weeks, Nathan T., and Cannon, Steven B.
- Subjects
GENOMICS ,BIOLOGICAL databases ,NUCLEOTIDE sequencing ,DATA curation ,DNA data banks - Abstract
As sequencing prices drop, genomic data accumulates—seemingly at a steadily increasing pace. Most genomic data potentially have value beyond the initial purpose—but only if shared with the scientific community. This, of course, is often easier said than done. Some of the challenges in sharing genomic data include data volume (raw file sizes and number of files), complexities, formats, nomenclatures, metadata descriptions, and the choice of a repository. In this paper, we describe 10 quick tips for sharing open genomic data. [ABSTRACT FROM AUTHOR]
- Published
- 2018
36. On variational solutions for whole brain serial-section histology using a Sobolev prior in the computational anatomy random orbit model.
- Author
-
Lee, Brian C., Tward, Daniel J., Mitra, Partha P., and Miller, Michael I.
- Subjects
HISTOLOGICAL techniques ,DIFFEOMORPHISMS ,HISTOLOGY ,BRAIN ,MICE - Abstract
This paper presents a variational framework for dense diffeomorphic atlas-mapping onto high-throughput histology stacks at the 20 μm meso-scale. The observed sections are modelled as Gaussian random fields conditioned on a sequence of unknown section by section rigid motions and unknown diffeomorphic transformation of a three-dimensional atlas. To regularize over the high-dimensionality of our parameter space (which is a product space of the rigid motion dimensions and the diffeomorphism dimensions), the histology stacks are modelled as arising from a first order Sobolev space smoothness prior. We show that the joint maximum a-posteriori, penalized-likelihood estimator of our high dimensional parameter space emerges as a joint optimization interleaving rigid motion estimation for histology restacking and large deformation diffeomorphic metric mapping to atlas coordinates. We show that joint optimization in this parameter space solves the classical curvature non-identifiability of the histology stacking problem. The algorithms are demonstrated on a collection of whole-brain histological image stacks from the Mouse Brain Architecture Project. [ABSTRACT FROM AUTHOR]
- Published
- 2018
37. SFPEL-LPI: Sequence-based feature projection ensemble learning for predicting LncRNA-protein interactions.
- Author
-
Zhang, Wen, Tang, Guifeng, Huang, Feng, Zhang, Xining, Yue, Xiang, and Wu, Wenjian
- Subjects
RNA-protein interactions ,GENETIC regulation ,RNA interference ,RNA splicing ,ADENYLATION (Biochemistry) - Abstract
LncRNA-protein interactions play important roles in post-transcriptional gene regulation, poly-adenylation, splicing and translation. Identification of lncRNA-protein interactions helps to understand lncRNA-related activities. Existing computational methods utilize multiple lncRNA features or multiple protein features to predict lncRNA-protein interactions, but features are not available for all lncRNAs or proteins; most of existing methods are not capable of predicting interacting proteins (or lncRNAs) for new lncRNAs (or proteins), which don’t have known interactions. In this paper, we propose the sequence-based feature projection ensemble learning method, “SFPEL-LPI”, to predict lncRNA-protein interactions. First, SFPEL-LPI extracts lncRNA sequence-based features and protein sequence-based features. Second, SFPEL-LPI calculates multiple lncRNA-lncRNA similarities and protein-protein similarities by using lncRNA sequences, protein sequences and known lncRNA-protein interactions. Then, SFPEL-LPI combines multiple similarities and multiple features with a feature projection ensemble learning frame. In computational experiments, SFPEL-LPI accurately predicts lncRNA-protein associations and outperforms other state-of-the-art methods. More importantly, SFPEL-LPI can be applied to new lncRNAs (or proteins). The case studies demonstrate that our method can find out novel lncRNA-protein interactions, which are confirmed by literature. Finally, we construct a user-friendly web server, available at . [ABSTRACT FROM AUTHOR]
- Published
- 2018
38. Bayesian adaptive dual control of deep brain stimulation in a computational model of Parkinson’s disease.
- Author
-
Grado, Logan L., Johnson, Matthew D., and Netoff, Theoden I.
- Subjects
BAYESIAN analysis ,PROBABILITY theory ,BRAIN stimulation ,KINDLING (Neurology) ,TRANSCRANIAL magnetic stimulation - Abstract
In this paper, we present a novel Bayesian adaptive dual controller (ADC) for autonomously programming deep brain stimulation devices. We evaluated the Bayesian ADC’s performance in the context of reducing beta power in a computational model of Parkinson’s disease, in which it was tasked with finding the set of stimulation parameters which optimally reduced beta power as fast as possible. Here, the Bayesian ADC has dual goals: (a) to minimize beta power by exploiting the best parameters found so far, and (b) to explore the space to find better parameters, thus allowing for better control in the future. The Bayesian ADC is composed of two parts: an inner parameterized feedback stimulator and an outer parameter adjustment loop. The inner loop operates on a short time scale, delivering stimulus based upon the phase and power of the beta oscillation. The outer loop operates on a long time scale, observing the effects of the stimulation parameters and using Bayesian optimization to intelligently select new parameters to minimize the beta power. We show that the Bayesian ADC can efficiently optimize stimulation parameters, and is superior to other optimization algorithms. The Bayesian ADC provides a robust and general framework for tuning stimulation parameters, can be adapted to use any feedback signal, and is applicable across diseases and stimulator designs. [ABSTRACT FROM AUTHOR]
- Published
- 2018
39. Efficient pedigree recording for fast population genetics simulation.
- Author
-
Kelleher, Jerome, Thornton, Kevin R., Ashander, Jaime, and Ralph, Peter L.
- Subjects
POPULATION genetics ,EUKARYOTES ,PHYLOGENY ,GENOTYPES ,ALGORITHMS - Abstract
In this paper we describe how to efficiently record the entire genetic history of a population in forwards-time, individual-based population genetics simulations with arbitrary breeding models, population structure and demography. This approach dramatically reduces the computational burden of tracking individual genomes by allowing us to simulate only those loci that may affect reproduction (those having non-neutral variants). The genetic history of the population is recorded as a succinct tree sequence as introduced in the software package msprime, on which neutral mutations can be quickly placed afterwards. Recording the results of each breeding event requires storage that grows linearly with time, but there is a great deal of redundancy in this information. We solve this storage problem by providing an algorithm to quickly ‘simplify’ a tree sequence by removing this irrelevant history for a given set of genomes. By periodically simplifying the history with respect to the extant population, we show that the total storage space required is modest and overall large efficiency gains can be made over classical forward-time simulations. We implement a general-purpose framework for recording and simplifying genealogical data, which can be used to make simulations of any population model more efficient. We modify two popular forwards-time simulation frameworks to use this new approach and observe efficiency gains in large, whole-genome simulations of one to two orders of magnitude. In addition to speed, our method for recording pedigrees has several advantages: (1) All marginal genealogies of the simulated individuals are recorded, rather than just genotypes. (2) A population of N individuals with M polymorphic sites can be stored in O(N log N + M) space, making it feasible to store a simulation’s entire final generation as well as its history. (3) A simulation can easily be initialized with a more efficient coalescent simulation of deep history. The software for recording and processing tree sequences is named tskit. [ABSTRACT FROM AUTHOR]
- Published
- 2018
40. Predicting B cell receptor substitution profiles using public repertoire data.
- Author
-
Dhar, Amrit, Davidsen, Kristian, Matsen, Frederick A., IV, and Minin, Vladimir N.
- Subjects
B cell receptors, AMINO acids, GENETIC mutation, CLONING, GERMINAL centers, IMMUNOTECHNOLOGY
- Abstract
B cells develop high affinity receptors during the course of affinity maturation, a cyclic process of mutation and selection. At the end of affinity maturation, a number of cells sharing the same ancestor (i.e. in the same “clonal family”) are released from the germinal center; their amino acid frequency profile reflects the allowed and disallowed substitutions at each position. These clonal-family-specific frequency profiles, called “substitution profiles”, are useful for studying the course of affinity maturation as well as for antibody engineering purposes. However, most often only a single sequence is recovered from each clonal family in a sequencing experiment, making it impossible to construct a clonal-family-specific substitution profile. Given the public release of many high-quality large B cell receptor datasets, one may ask whether it is possible to use such data in a prediction model for clonal-family-specific substitution profiles. In this paper, we present the method “Substitution Profiles Using Related Families” (SPURF), a penalized tensor regression framework that integrates information from a rich assemblage of datasets to predict the clonal-family-specific substitution profile for any single input sequence. Using this framework, we show that substitution profiles from similar clonal families can be leveraged together with simulated substitution profiles and germline gene sequence information to improve prediction. We fit this model on a large public dataset and validate the robustness of our approach on two external datasets. Furthermore, we provide a command-line tool in an open-source software package () implementing these ideas and providing easy prediction using our pre-fit models. [ABSTRACT FROM AUTHOR]
- Published
- 2018
41. Ten simple rules for scientists: Improving your writing productivity.
- Author
-
Peterson, Todd C., Kleppner, Sofie R., and Botham, Crystal M.
- Subjects
WORK environment, SELF-talk
- Abstract
An introduction to the journal is presented in which the editor discusses various reports in the issue on topics including the importance of writing in science research, developing a working environment in the workplace, and managing self-talk about writing.
- Published
- 2018
42. From Spontaneous Motor Activity to Coordinated Behaviour: A Developmental Model.
- Author
-
Marques, Hugo Gravato, Bharadwaj, Arjun, and Iida, Fumiya
- Subjects
MOTOR neurons, DEVELOPMENTAL neurobiology, MACHINE learning, COMPUTATIONAL neuroscience, REFLEXES, MAMMAL behavior
- Abstract
In mammals, the developmental path that links the primary behaviours observed during foetal stages to the full fledged behaviours observed in adults is still beyond our understanding. Often theories of motor control try to deal with the process of incremental learning in an abstract and modular way without establishing any correspondence with the mammalian developmental stages. In this paper, we propose a computational model that links three distinct behaviours which appear at three different stages of development. In order of appearance, these behaviours are: spontaneous motor activity (SMA), reflexes, and coordinated behaviours, such as locomotion. The goal of our model is to address in silico four hypotheses that are currently hard to verify in vivo: First, the hypothesis that spinal reflex circuits can be self-organized from the sensor and motor activity induced by SMA. Second, the hypothesis that supraspinal systems can modulate reflex circuits to achieve coordinated behaviour. Third, the hypothesis that, since SMA is observed in an organism throughout its entire lifetime, it provides a mechanism suitable to maintain the reflex circuits aligned with the musculoskeletal system, and thus adapt to changes in body morphology. And fourth, the hypothesis that by changing the modulation of the reflex circuits over time, one can switch between different coordinated behaviours. Our model is tested in a simulated musculoskeletal leg actuated by six muscles arranged in a number of different ways. Hopping is used as a case study of coordinated behaviour. Our results show that reflex circuits can be self-organized from SMA, and that, once these circuits are in place, they can be modulated to achieve coordinated behaviour. In addition, our results show that our model can naturally adapt to different morphological changes and perform behavioural transitions. [ABSTRACT FROM AUTHOR]
- Published
- 2014
43. Ten simple rules for measuring the impact of workshops.
- Author
-
Sufi, Shoaib, Nenadic, Aleksandra, Silva, Raniere, Balzano, Melissa, Coelho, Sara, Ford, Heather, Jones, Catherine, Higgins, Vanessa, Duckles, Beth, Simera, Iveta, de Beyer, Jennifer A., Struthers, Caroline, Nurmikko-Fuller, Terhi, Bellis, Louisa, Miah, Wadud, Wilde, Adriana, Emsley, Iain, and Philippe, Olivier
- Subjects
FORUMS, RESEARCH, DECISION making, STRATEGIC planning, PARTICIPATION
- Abstract
Workshops are used to explore a specific topic, to transfer knowledge, to solve identified problems, or to create something new. In funded research projects and other research endeavours, workshops are the mechanism used to gather the wider project, community, or interested people together around a particular topic. However, natural questions arise: how do we measure the impact of these workshops? Do we know whether they are meeting the goals and objectives we set for them? What indicators should we use? In response to these questions, this paper will outline rules that will improve the measurement of the impact of workshops. [ABSTRACT FROM AUTHOR]
- Published
- 2018
44. A marginalized two-part Beta regression model for microbiome compositional data.
- Author
-
Chai, Haitao, Jiang, Hongmei, Lin, Lu, and Liu, Lei
- Subjects
MICROORGANISMS, HUMAN microbiota, REGRESSION analysis, PUBLIC health, METAGENOMICS
- Abstract
In microbiome studies, an important goal is to detect differential abundance of microbes across clinical conditions and treatment options. However, the microbiome compositional data (quantified by relative abundance) are highly skewed, bounded in [0, 1), and often have many zeros. A two-part model is commonly used to separate zeros and positive values explicitly by two submodels: a logistic model for the probability of a species being present in Part I, and a Beta regression model for the relative abundance conditional on the presence of the species in Part II. However, the regression coefficients in Part II cannot provide a marginal (unconditional) interpretation of covariate effects on the microbial abundance, which is of great interest in many applications. In this paper, we propose a marginalized two-part Beta regression model which captures the zero-inflation and skewness of microbiome data and also allows investigators to examine covariate effects on the marginal (unconditional) mean. We demonstrate its practical performance using simulation studies and apply the model to a real metagenomic dataset on mouse skin microbiota. We find that under the proposed marginalized model, without loss in power, the likelihood ratio test performs better in controlling the type I error than those under conventional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
45. 3D morphology-based clustering and simulation of human pyramidal cell dendritic spines.
- Author
-
Luengo-Sanchez, Sergio, Fernaud-Espinosa, Isabel, Bielza, Concha, Benavides-Piccione, Ruth, Larrañaga, Pedro, and DeFelipe, Javier
- Subjects
DENDRITIC cells, PYRAMIDAL neurons, DENDRITES, NEURONS, CEREBRAL cortex, DENDRITIC spines, BRAIN mapping
- Abstract
The dendritic spines of pyramidal neurons are the targets of most excitatory synapses in the cerebral cortex. They have a wide variety of morphologies, and their morphology appears to be critical from the functional point of view. To further characterize dendritic spine geometry, we used in this paper over 7,000 individually 3D reconstructed dendritic spines from human cortical pyramidal neurons to group dendritic spines using model-based clustering. This approach uncovered six separate groups of human dendritic spines. To better understand the differences between these groups, the discriminative characteristics of each group were identified as a set of rules. Model-based clustering was also useful for simulating accurate 3D virtual representations of spines that matched the morphological definitions of each cluster. This mathematical approach could provide a useful tool for theoretical predictions on the functional features of human pyramidal neurons based on the morphology of dendritic spines. [ABSTRACT FROM AUTHOR]
- Published
- 2018
46. The effect of cell geometry on polarization in budding yeast.
- Author
-
Trogdon, Michael, Drawert, Brian, Gomez, Carlos, Banavar, Samhita P., Yi, Tau-Mu, Campàs, Otger, and Petzold, Linda R.
- Subjects
SACCHAROMYCES cerevisiae, STEM cells, BIOLOGICAL evolution, GENETIC transcription, SYNTHETIC biology
- Abstract
The localization (or polarization) of proteins on the membrane during the mating of budding yeast (Saccharomyces cerevisiae) is an important model system for understanding simple pattern formation within cells. While there are many existing mathematical models of polarization, for both budding and mating, there are still many aspects of this process that are not well understood. In this paper we set out to elucidate the effect that the geometry of the cell can have on the dynamics of certain models of polarization. Specifically, we look at several spatial stochastic models of Cdc42 polarization that have been adapted from published models, on a variety of tip-shaped geometries, to replicate the shape change that occurs during the growth of the mating projection. We show here that there is a complex interplay between the dynamics of polarization and the shape of the cell. Our results show that while models of polarization can generate a stable polarization cap, its localization at the tip of mating projections is unstable, with the polarization cap drifting away from the tip of the projection in a geometry dependent manner. We also compare predictions from our computational results to experiments that observe cells with projections of varying lengths, and track the stability of the polarization cap. Lastly, we examine one model of actin polarization and show that it is unlikely, at least for the models studied here, that actin dynamics and vesicle traffic are able to overcome this effect of geometry. [ABSTRACT FROM AUTHOR]
- Published
- 2018
47. The role of intracellular signaling in the stripe formation in engineered Escherichia coli populations.
- Author
-
Xue, Xiaoru, Xue, Chuan, and Tang, Min
- Subjects
ESCHERICHIA coli enzymes, ESCHERICHIA coli proteins, ESCHERICHIA coli physiology, COMPUTATIONAL biology, CELL division
- Abstract
Recent experiments showed that engineered Escherichia coli colonies grow and self-organize into periodic stripes with high and low cell densities in semi-solid agar. The stripes develop sequentially behind a radially propagating colony front, similar to the formation of many other periodic patterns in nature. These bacteria were created by genetically coupling the intracellular chemotaxis pathway of wild-type cells with a quorum sensing module through the protein CheZ. In this paper, we develop multiscale models to investigate how this intracellular pathway affects stripe formation. We first develop a detailed hybrid model that treats each cell as an individual particle and incorporates intracellular signaling via an internal ODE system. To overcome the computational cost of the hybrid model caused by the large number of cells involved, we next derive a mean-field PDE model from the hybrid model using asymptotic analysis. We show that this analysis is justified by the tight agreement between the PDE model and the hybrid model in 1D simulations. Numerical simulations of the PDE model in 2D with radial symmetry agree with experimental data semi-quantitatively. Finally, we use the PDE model to make a number of testable predictions on how the stripe patterns depend on cell-level parameters, including cell speed, cell doubling time and the turnover rate of intracellular CheZ. [ABSTRACT FROM AUTHOR]
- Published
- 2018
48. Simulations to benchmark time-varying connectivity methods for fMRI.
- Author
-
Thompson, William Hedley, Richter, Craig Geoffrey, Plavén-Sigray, Pontus, and Fransson, Peter
- Subjects
FUNCTIONAL magnetic resonance imaging, BRAIN imaging, SIMULATION methods & models, MULTIPLICATION, ANALYSIS of covariance
- Abstract
There is a current interest in quantifying time-varying connectivity (TVC) based on neuroimaging data such as fMRI. Many methods have been proposed, and are being applied, revealing new insight into the brain’s dynamics. However, given that the ground truth for TVC in the brain is unknown, many concerns remain regarding the accuracy of proposed estimates. Since there exist many TVC methods, it is difficult to assess differences in time-varying connectivity between studies. In this paper, we present tvc_benchmarker, which is a Python package containing four simulations to test TVC methods. Here, we evaluate five different methods that together represent a wide spectrum of current approaches to estimating TVC (sliding window, tapered sliding window, multiplication of temporal derivatives, spatial distance and jackknife correlation). These simulations were designed to test each method’s ability to track changes in covariance over time, which is a key property in TVC analysis. We found that all tested methods correlated positively with each other, but there were large differences in the strength of the correlations between methods. To facilitate comparisons with future TVC methods, we propose that the described simulations can act as benchmark tests for evaluation of methods. Using tvc_benchmarker, researchers can easily add, compare and submit their own TVC methods to evaluate their performance. [ABSTRACT FROM AUTHOR]
- Published
- 2018
49. Predictive modelling of a novel anti-adhesion therapy to combat bacterial colonisation of burn wounds.
- Author
-
Roberts, Paul A., Huebinger, Ryan M., Keen, Emma, Krachler, Anne-Marie, and Jabbari, Sara
- Subjects
TREATMENT for burns & scalds, ANTIBIOTICS, DRUG resistance in bacteria, COLONIZATION (Ecology), DRUG development
- Abstract
As the development of new classes of antibiotics slows, bacterial resistance to existing antibiotics is becoming an increasing problem. A potential solution is to develop treatment strategies with an alternative mode of action. We consider one such strategy: anti-adhesion therapy. Whereas antibiotics act directly upon bacteria, either killing them or inhibiting their growth, anti-adhesion therapy impedes the binding of bacteria to host cells. This prevents bacteria from deploying their arsenal of virulence mechanisms, while simultaneously rendering them more susceptible to natural and artificial clearance. In this paper, we consider a particular form of anti-adhesion therapy, involving biomimetic multivalent adhesion molecule 7 coupled polystyrene microbeads, which competitively inhibit the binding of bacteria to host cells. We develop a mathematical model, formulated as a system of ordinary differential equations, to describe inhibitor treatment of a Pseudomonas aeruginosa burn wound infection in the rat. Benchmarking our model against in vivo data from an ongoing experimental programme, we use the model to explain bacteria population dynamics and to predict the efficacy of a range of treatment strategies, with the aim of improving treatment outcome. The model consists of two physical compartments: the host cells and the exudate. It is found that, when effective in reducing the bacterial burden, inhibitor treatment operates both by preventing bacteria from binding to the host cells and by reducing the flux of daughter cells from the host cells into the exudate. Our model predicts that inhibitor treatment cannot eliminate the bacterial burden when used in isolation; however, when combined with regular or continuous debridement of the exudate, elimination is theoretically possible. Lastly, we present ways to improve therapeutic efficacy, as predicted by our mathematical model. [ABSTRACT FROM AUTHOR]
- Published
- 2018
50. Correcting for batch effects in case-control microbiome studies.
- Author
-
Gibbons, Sean M., Duvallet, Claire, and Alm, Eric J.
- Subjects
CASE-control method, MICROARRAY technology, RNA, MICROBIAL genomics
- Abstract
High-throughput data generation platforms, like mass-spectrometry, microarrays, and second-generation sequencing are susceptible to batch effects due to run-to-run variation in reagents, equipment, protocols, or personnel. Currently, batch correction methods are not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure where features (i.e. bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study prior to pooling data across studies. We look at how this percentile-normalization method compares to traditional meta-analysis methods for combining independent p-values and to limma and ComBat, widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile-normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses. [ABSTRACT FROM AUTHOR]
- Published
- 2018