Descriptor: "R PACKAGE" / Database: Complementary Index - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"R PACKAGE"' showing total 380 results

1. Person explanatory multidimensional item response theory with the instrument package in R.

Author: Kleinsasser, Michael J., Mistry, Ritesh, Hsieh, Hsing-Fang, McCarthy, William J., and Raghunathan, Trivellore
Subjects: FIXED effects model, RANDOM effects model, BAYESIAN analysis, MODEL theory, REGRESSION analysis, ITEM response theory
Abstract: We present the new R package instrument to perform Bayesian estimation of person explanatory multidimensional item response theory. The package implements an exploratory multidimensional item response theory model and a higher-order multidimensional item response theory model, a type of confirmatory multidimensional item response theory. Explanation of person parameters is accomplished by fixed and random effect linear regression models. Estimation is carried out using Hamiltonian Monte Carlo in Stan. In this article, we provide a detailed description of the models; we use the instrument package to demonstrate fitting explanatory item response models with fixed and random effects (i.e., mixed modeling) of person parameters in R; and, we perform a simulation study to evaluate the performance of our implementation of the models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

2. A Next Generation of Hierarchical Bayesian Analyses of Hybrid Zones Enables Model‐Based Quantification of Variation in Introgression in R.

Author: Gompert, Zachariah, DeRaad, Devon A., and Buerkle, C. Alex
Abstract: Hybrid zones, where genetically distinct groups of organisms meet and interbreed, offer valuable insights into the nature of species and speciation. Here, we present a new R package, bgchm, for population genomic analyses of hybrid zones. This R package extends and updates the existing bgc software and combines Bayesian analyses of hierarchical genomic clines with Bayesian methods for estimating hybrid indexes, interpopulation ancestry proportions, and geographic clines. Compared to existing software, bgchm offers enhanced efficiency through Hamiltonian Monte Carlo sampling and the ability to work with genotype likelihoods combined with a hierarchical Bayesian approach, enabling inference for diverse types of genetic data sets. The package also facilitates the quantification of introgression patterns across genomes, which is crucial for understanding reproductive isolation and speciation genetics. We first describe the models underlying bgchm and then provide an overview of the R package and illustrate its use through the analysis of simulated and empirical data sets. We show that bgchm generates accurate estimates of model parameters under a variety of conditions, especially when the genetic loci analyzed are highly ancestry informative. This includes relatively robust estimates of genome‐wide variability in clines, which has not been the focus of previous models and methods. We also illustrate how both selection and genetic drift contribute to variability in introgression among loci and how additional information can be used to help distinguish these contributions. We conclude by describing the promises and limitations of bgchm, comparing bgchm to other software for genomic cline analyses, and identifying areas for fruitful future development. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. mulea: An R package for enrichment analysis using multiple ontologies and empirical false discovery rate.

Author: Turek, Cezary, Ölbei, Márton, Stirling, Tamás, Fekete, Gergely, Tasnádi, Ervin, Gul, Leila, Bohár, Balázs, Papp, Balázs, Jurkowski, Wiktor, and Ari, Eszter
Subjects: FALSE discovery rate, REGULATOR genes, PROTEIN domains, GENE ontology, FUNCTIONAL analysis
Abstract: Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. mulea is distributed as a CRAN R package downloadable from https://cran.r-project.org/web/packages/mulea/ and https://github.com/ELTEbioinformatics/mulea. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. detectCilia: An R Package for Automated Detection and Length Measurement of Primary Cilia.

Author: Budde-Sagert, Kai, Krueger, Simone, Sehlke, Clemens, Lemcke, Heiko, Jonitz-Heincke, Anika, David, Robert, Bader, Rainer, and Uhrmacher, Adelinde M
Subjects: TRANSFORMING growth factors-beta, SOMATOMEDIN C, CONFOCAL microscopy, LENGTH measurement, CELL communication, CARTILAGE regeneration
Abstract: Background and objective: The primary cilium is a small protrusion found on most mammalian cells. It acts as a cellular antenna, being involved in various cell signaling pathways. The length of the primary cilium affects its function. To study the impact of physical or chemical stimuli on cilia, their lengths must be determined easily and reproducibly. Methods: We have developed and evaluated an open-source R package called detectCilia to detect and measure primary cilia automatically. As a case study to demonstrate the capability of our tool, we compared the influence of 4 different cell culture media compositions on the lengths of primary cilia in human chondrocytes. These media compositions include (1) insulin-transferrin-selenium (ITS); (2) ITS and dexamethasone (Dexa); (3) ITS, Dexa, insulin-like growth factor 1 (IGF-1), and transforming growth factor beta 1 (TGF-β1); and (4) fetal bovine serum (FBS). Results: The assessment of detectCilia included a comparison with 2 similar tools: ACDC (Automated Cilia Detection in Cells) and CiliaQ. Several differences and advantages of our package make it a valuable addition to these tools. In the case study, we have observed variations in the ciliary lengths associated with using different media compositions. Conclusions: We conclude that detectCilia can automatically and reproducibly detect and measure primary cilia in confocal microscopy images with low false-positive rates without requiring extensive user interaction. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Visual Integration of Genome-Wide Association Studies and Differential Expression Results with the Hidecan R Package.

Author: Angelin-Bonnet, Olivia, Vignes, Matthieu, Biggs, Patrick J., Baldwin, Samantha, and Thomson, Susan
Abstract: Background/Objectives: We present hidecan, an R package for generating visualisations that summarise the results of one or more genome-wide association studies (GWAS) and differential expression analyses, as well as manually curated candidate genes, e.g., extracted from the literature. This tool is applicable to all ploidy levels; we notably provide functionalities to facilitate the visualisation of GWAS results obtained for autotetraploid organisms with the GWASpoly package. Results: We illustrate the capabilities of hidecan with examples from two autotetraploid potato datasets. Conclusions: The hidecan package is implemented in R and is publicly available on the CRAN repository and on GitHub. A description of the package, as well as a detailed tutorial, is made available alongside the package. It is also part of the VIEWpoly tool for the visualisation and exploration of results from polyploids computational tools. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. tigeR: Tumor immunotherapy gene expression data analysis R package.

Author: Chen, Yihao, He, Li‐Na, Zhang, Yuanzhe, Gong, Jingru, Xu, Shuangbin, Shu, Yuelong, Zhang, Di, Yu, Guangchuang, and Zuo, Zhixiang
Subjects: MACHINE learning, GENE expression, TUMOR microenvironment, PREDICTION models, SOURCE code
Abstract: Immunotherapy shows great promise for treating advanced cancers, but its effectiveness varies widely among different patients and cancer types. Identifying biomarkers and developing robust predictive models to discern which patients are most likely to benefit from immunotherapy is of great importance. In this context, we have developed the tumor immunotherapy gene expression R package (tigeR 1.0) to address the increasing need for effective tools to explore biomarkers and construct predictive models. tigeR encompasses four distinct yet closely interconnected modules. The Biomarker Evaluation module enables researchers to evaluate whether the biomarkers of interest are associated with immunotherapy response via built‐in or custom immunotherapy gene expression data. The Tumor Microenvironment Deconvolution module integrates 10 open‐source algorithms to obtain the proportions of different cell types within the tumor microenvironment, facilitating the investigation of the association between immune cell populations and immunotherapy response. The Prediction Model Construction module equips users with the ability to construct sophisticated prediction models using a range of built‐in machine‐learning algorithms. The Response Prediction module predicts the immunotherapy response for the patients from gene expression data using our pretrained machine learning models or public gene expression signatures. By providing these diverse functionalities, tigeR aims to simplify the process of analyzing immunotherapy gene expression data, thus making it accessible to researchers without advanced programming skills. The source code and example for the tigeR project can be accessed at http://github.com/YuLab-SMU/tigeR. 
Highlights: Tumor Immunotherapy Gene Expression R package (tigeR) is an effective R package to explore biomarkers and construct predictive models to predict immunotherapeutic outcomes. tigeR enables the flexibility to load built‐in or custom gene expression data with immunotherapy outcome information. tigeR encompasses four distinct yet closely interconnected modules, including the Biomarker Evaluation module, Tumor Microenvironment Deconvolution module, Prediction Model Construction module, and Response Prediction module. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

7. tidysdm: Leveraging the flexibility of tidymodels for species distribution modelling in R.

Author: Leonardi, Michela, Colucci, Margherita, Pozzi, Andrea Vittorio, Scerri, Eleanor M. L., and Manica, Andrea
Subjects: MACHINE learning, SPECIES distribution, PALEOBIOLOGY, BIOGEOGRAPHY, INTERFACE structures
Abstract: In species distribution modelling (SDM), it is common practice to explore multiple machine learning (ML) algorithms and combine their results into ensembles. In R, many implementations of different ML algorithms are available but, as they were mostly developed independently, they often use inconsistent syntax and data structures. For this reason, repeating an analysis with multiple algorithms and combining their results can be challenging. Specialised SDM packages solve this problem by providing a simpler, unified interface by wrapping the original functions to tackle each specific requirement. However, creating and maintaining such interfaces is time‐consuming, and with this approach, the user cannot easily integrate other methods that may become available. Here, we present tidysdm, an R package that solves this problem by taking advantage of the tidymodels universe. tidymodels provide standardised grammar, data structures and modelling interfaces, and a well‐documented infrastructure to integrate new algorithms and metrics. The wide adoption of tidymodels means that most ML algorithms and metrics are already integrated, and the user can add additional ones. Moreover, because of the broad adoption of tidymodels, new statistical approaches tend to be implemented quickly, making them easily integrated into existing pipelines and analyses. tidysdm takes advantage of the tidymodels universe to provide a flexible and fully customisable pipeline to fit SDM. It includes SDM‐specific algorithms and metrics, and methods to facilitate the use of spatial data within tidymodels. Additionally, tidysdm is the first software that natively allows SDM to be performed using data from different periods, expanding the availability of SDM for scholars working in palaeontology, archaeology, palaeobiology, palaeoecology and other disciplines focussing on the past. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

8. simona: a comprehensive R package for semantic similarity analysis on bio-ontologies.

Author: Gu, Zuguang
Subjects: KNOWLEDGE representation (Information theory), DATA structures, BIOLOGICAL systems, RESEARCH personnel, ONTOLOGY, ONTOLOGIES (Information retrieval)
Abstract: Background: Bio-ontologies are keys in structuring complex biological information for effective data integration and knowledge representation. Semantic similarity analysis on bio-ontologies quantitatively assesses the degree of similarity between biological concepts based on the semantics encoded in ontologies. It plays an important role in structured and meaningful interpretations and integration of complex data from multiple biological domains. Results: We present simona, a novel R package for semantic similarity analysis on general bio-ontologies. Simona implements infrastructures for ontology analysis by offering efficient data structures, fast ontology traversal methods, and elegant visualizations. Moreover, it provides a robust toolbox supporting over 70 methods for semantic similarity analysis. With simona, we conducted a benchmark against current semantic similarity methods. The results demonstrate methods are clustered based on their mathematical methodologies, thus guiding researchers in the selection of appropriate methods. Additionally, we explored annotation-based versus topology-based methods, revealing that semantic similarities solely based on ontology topology can efficiently reveal semantic similarity structures, facilitating analysis on less-studied organisms and other ontologies. Conclusions: Simona offers a versatile interface and efficient implementation for processing, visualization, and semantic similarity analysis on bio-ontologies. We believe that simona will serve as a robust tool for uncovering relationships and enhancing the interoperability of biological knowledge systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

10. ggPlantmap: an open-source R package for the creation of informative and quantitative ggplot maps derived from plant images.

Author: Jo, Leonardo and Kajala, Kaisa
Subjects: BOTANISTS, BOTANY, GENE expression, PLANT communities, TRANSCRIPTOMES
Abstract: As plant research generates an ever-growing volume of spatial quantitative data, the need for decentralized and user-friendly visualization tools to explore large and complex datasets becomes crucial. Existing resources, such as the Plant eFP (electronic Fluorescent Pictograph) viewer, have played a pivotal role in the communication of gene expression data across many plant species. However, although widely used by the plant research community, the Plant eFP viewer lacks open and user-friendly tools for the creation of customized expression maps independently. Plant biologists with less coding experience can often encounter challenges when attempting to explore ways to communicate their own spatial quantitative data. We present 'ggPlantmap', an open-source R package designed to address this challenge by providing an easy and user-friendly method for the creation of ggplot representative maps from plant images. ggPlantmap is built in R, one of the most used languages in biology, to empower plant scientists to create and customize eFP-like viewers tailored to their experimental data. Here, we provide an overview of the package and tutorials that are accessible even to users with minimal R programming experience. We hope that ggPlantmap can assist the plant science community, fostering innovation, and improving our understanding of plant development and function. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. unmconf: an R package for Bayesian regression with unmeasured confounders.

Author: Hebdon, Ryan, Stamey, James, Kahle, David, and Zhang, Xiang
Subjects: BINOMIAL distribution, GAUSSIAN distribution, SENSITIVITY analysis, BAYESIAN analysis, SCIENTIFIC observation
Abstract: The inability to correctly account for unmeasured confounding can lead to bias in parameter estimates, invalid uncertainty assessments, and erroneous conclusions. Sensitivity analysis is an approach to investigate the impact of unmeasured confounding in observational studies. However, the adoption of this approach has been slow given the lack of accessible software. An extensive review of available R packages to account for unmeasured confounding lists deterministic sensitivity analysis methods, but no R packages were listed for probabilistic sensitivity analysis. The R package unmconf implements the first available package for probabilistic sensitivity analysis through a Bayesian unmeasured confounding model. The package allows for normal, binary, Poisson, or gamma responses, accounting for one or two unmeasured confounders from the normal or binomial distribution. The goal of unmconf is to implement a user-friendly package that performs Bayesian modeling in the presence of unmeasured confounders, with simple commands on the front end while performing more intensive computation on the back end. We investigate the applicability of this package through novel simulation studies. The results indicate that credible intervals will have near nominal coverage probability and smaller bias when modeling the unmeasured confounder(s) for varying levels of internal/external validation data across various combinations of response-unmeasured confounder distributional families. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. ToxDAR: A Workflow Software for Analyzing Toxicologically Relevant Proteomic and Transcriptomic Data, from Data Preparation to Toxicological Mechanism Elucidation.

Author: Jiang, Peng, Zhang, Zuzhen, Yu, Qing, Wang, Ze, Diao, Lihong, and Li, Dong
Subjects: THYROID hormone receptors, CHEMICAL reactions, APOPTOSIS, WORKFLOW software, KNOWLEDGE graphs
Abstract: Exploration of toxicological mechanisms is imperative for the assessment of potential adverse reactions to chemicals and pharmaceutical agents, the engineering of safer compounds, and the preservation of public health. It forms the foundation of drug development and disease treatment. High-throughput proteomics and transcriptomics can accurately capture the body's response to toxins and have become key tools for revealing complex toxicological mechanisms. Recently, a vast amount of omics data related to toxicological mechanisms have been accumulated. However, analyzing and utilizing these data remains a major challenge for researchers, especially as there is a lack of a knowledge-based analysis system to identify relevant biological pathways associated with toxicity from the data and to establish connections between omics data and existing toxicological knowledge. To address this, we have developed ToxDAR, a workflow-oriented R package for preprocessing and analyzing toxicological multi-omics data. ToxDAR integrates packages like NormExpression, DESeq2, and igraph, and utilizes R functions such as prcomp and phyper. It supports data preparation, quality control, differential expression analysis, functional analysis, and network analysis. ToxDAR's architecture also includes a knowledge graph with five major categories of mechanism-related biological entities and details fifteen types of interactions among them, providing comprehensive knowledge annotation for omics data analysis results. As a case study, we used ToxDAR to analyze a transcriptomic dataset on the toxicology of triphenyl phosphate (TPP). The results indicate that TPP may impair thyroid function by activating thyroid hormone receptor β (THRB), impacting pathways related to programmed cell death and inflammation. As a workflow-oriented data analysis tool, ToxDAR is expected to be crucial for understanding toxic mechanisms from omics data, discovering new therapeutic targets, and evaluating chemical safety. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
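
Note on the functional analysis step: the abstract names R's phyper, which evaluates the hypergeometric distribution commonly used for pathway overrepresentation tests. The abstract does not say how ToxDAR calls it, so the following is only a minimal Python sketch of that kind of test under assumed inputs. The function name and all gene counts are hypothetical. The value it computes corresponds to phyper(k - 1, K, N - K, n, lower.tail = FALSE) in R.

```python
from math import comb


def hypergeom_upper_tail(k, K, N, n):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the pathway,
    n: genes in the study list, k: study-list genes annotated to the pathway.
    """
    upper = min(K, n)      # the overlap cannot exceed either set size
    total = comb(N, n)     # number of ways to draw the study list
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts: 8 of 100 differentially expressed genes fall in a
# 40-gene pathway, against a background of 2000 genes (expected overlap: 2).
p_value = hypergeom_upper_tail(k=8, K=40, N=2000, n=100)
print(f"overrepresentation p-value: {p_value:.3g}")
```

In practice such p-values would be corrected for multiple testing across all pathways examined; that step is omitted here.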

13. CleanUpRNAseq: An R/Bioconductor Package for Detecting and Correcting DNA Contamination in RNA-Seq Data.

Author: Liu, Haibo, Hu, Kai, O'Connor, Kevin, Kelliher, Michelle A., and Zhu, Lihua Julie
Subjects: GENE expression profiling, GENE expression, RNA sequencing, DATA integrity, DATA analysis
Abstract: RNA sequencing (RNA-seq) has become a standard method for profiling gene expression, yet genomic DNA (gDNA) contamination carried over to the sequencing library poses a significant challenge to data integrity. Detecting and correcting this contamination is vital for accurate downstream analyses. Particularly, when RNA samples are scarce and invaluable, it becomes essential not only to identify but also to correct gDNA contamination to maximize the data's utility. However, existing tools capable of correcting gDNA contamination are limited and lack thorough evaluation. To fill the gap, we developed CleanUpRNAseq, which offers a comprehensive set of functionalities for identifying and correcting gDNA-contaminated RNA-seq data. Our package offers three correction methods for unstranded RNA-seq data and a dedicated approach for stranded data. Through rigorous validation on published RNA-seq datasets with known levels of gDNA contamination and real-world RNA-seq data, we demonstrate CleanUpRNAseq's efficacy in detecting and correcting detrimental levels of gDNA contamination across diverse library protocols. CleanUpRNAseq thus serves as a valuable tool for post-alignment quality assessment of RNA-seq data and should be integrated into routine workflows for RNA-seq data analysis. Its incorporation into OneStopRNAseq should significantly bolster the accuracy of gene expression quantification and differential expression analysis of RNA-seq data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. KBAscope: key biodiversity area identification in R.

Author: Spiliopoulou, Konstantina, Rigal, François, Plumptre, Andrew J., Trigas, Panayiotis, Paragamian, Kaloust, Hochkirch, Axel, Lymberakis, Petros, Portolou, Danae, Stoumboudi, Maria Th., and Triantis, Kostas A.
Subjects: GRID cells, DATA editing, SUSTAINABLE development, CELL size, HABITATS
Abstract: Key Biodiversity Areas (KBAs) represent the largest global network of sites critical to the persistence of biodiversity, which have been identified against standardised quantitative criteria. Sites that hold very high biodiversity value or potential are given specific attention on site‐based conservation targets of the Kunming‐Montreal Global Biodiversity Framework (GBF), and KBAs are already used in indicators for the GBF and the Sustainable Development Goals. However, most of the species that trigger KBA status are birds and to maximise benefits for biodiversity under the actions taken to fulfil the GBF, countries need to update their KBAs to represent important sites across multiple taxa. Here we introduce KBAscope, an R package to identify potential KBAs using multiple taxonomic groups. KBAscope provides flexible, user‐friendly functions to edit species data (population, range maps, area of occupancy, area of habitat and localities); apply KBA criteria; and generate outputs to support the delineation and validation of KBAs. The details of the analysis – such as the spatial units tested or the KBA criteria applied – can be decided according to the scope of the analysis. We demonstrate the functionality of KBAscope by using it to identify potential KBAs in Greece based on multiple terrestrial taxonomic groups and four sizes of grid cells (4 km², 25 km², 100 km², 225 km²). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. ARDL: An R Package for ARDL Models and Cointegration.

Author: Natsiopoulos, Kleanthis and Tzeremes, Nickolaos G.
Subjects: COINTEGRATION, LANGUAGE & languages
Abstract: This paper presents the ARDL package for the statistical language R, demonstrating its main functionalities in a step-by-step guide. Some of its main advantages over other related R packages are the intuitive API, and the fact that it includes many important features missing from other packages that are essential for an in-depth analysis. Additionally, it is designed in such a way that it can be combined with other packages for post-regression diagnostics and tests. These characteristics are shown through an example, where we showcase part of the application demonstrated in the seminal work of Pesaran et al. (J Appl Econom 16:289–326, 2001). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

16. Cross-National Analysis of Opioid Prescribing Patterns: Enhancements and Insights from the OralOpioids R Package in Canada and the United States.

Author: Banerjee, Ankona, Nobleza, Kenneth, Nguyen, Duc T., and Stricker, Erik
Subjects: OPIOID epidemic, OPIOID abuse, DRUG prescribing, DATABASES, RESEARCH personnel
Abstract: Background: The opioid crisis remains a significant public health challenge in North America, highlighted by the substantial need for tools to analyze and understand opioid potency and prescription patterns. Methods: The OralOpioids package automates the retrieval, processing, and analysis of opioid data from Health Canada's Drug Product Database (DPD) and the U.S. Food and Drug Administration's (FDA) National Drug Code (NDC) database. It includes functions such as load_Opioid_Table, which integrates country-specific data processing and Morphine Equivalent Dose (MED) calculations, providing a comprehensive dataset for analysis. The package facilitates a comprehensive examination of opioid prescriptions, allowing researchers to identify high-risk opioids and patterns that could inform policy and healthcare practices. Results: The integration of MED calculations with Canadian and U.S. data provides a robust tool for assessing opioid potency and prescribing practices. The OralOpioids R package is an essential tool for public health researchers, enabling a detailed analysis of North American opioid prescriptions. Conclusions: By providing easy access to opioid potency data and supporting cross-national studies, the package plays a critical role in addressing the opioid crisis. It suggests a model for similar tools that could be adapted for global use, enhancing our capacity to manage and mitigate opioid misuse effectively. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
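
Note on the MED calculation: a Morphine Equivalent Dose is conventionally obtained by multiplying the daily dose of an opioid by a drug-specific conversion factor. The sketch below illustrates that arithmetic in Python. It does not use or reproduce the OralOpioids package or its load_Opioid_Table function; the function name and the factor table are assumptions for illustration (the factors are commonly cited oral values, not values taken from the package) and are not clinical guidance.

```python
# Illustrative oral conversion factors relative to morphine (assumed values,
# not taken from the OralOpioids package).
CONVERSION_FACTOR = {
    "morphine": 1.0,
    "oxycodone": 1.5,
    "codeine": 0.15,
}


def daily_med(drug, strength_mg, units_per_day):
    """Return the daily Morphine Equivalent Dose in mg for one prescription."""
    try:
        factor = CONVERSION_FACTOR[drug]
    except KeyError:
        raise ValueError(f"no conversion factor available for {drug!r}") from None
    return strength_mg * units_per_day * factor


# Hypothetical prescription: oxycodone 10 mg tablets, four tablets per day.
print(daily_med("oxycodone", strength_mg=10, units_per_day=4))
```

Raising an error for an unknown drug, rather than defaulting to a factor of 1, avoids silently reporting a wrong potency.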

17. Bayesian generalized additive model selection including a fast variational option.

Author: He, Virginia X. and Wand, Matt P.
Abstract: We use Bayesian model selection paradigms, such as group least absolute shrinkage and selection operator priors, to facilitate generalized additive model selection. Our approach allows for the effects of continuous predictors to be categorized as either zero, linear or non-linear. Employment of carefully tailored auxiliary variables results in Gibbsian Markov chain Monte Carlo schemes for practical implementation of the approach. In addition, mean field variational algorithms with closed form updates are obtained. Whilst not as accurate, this fast variational option enhances scalability to very large data sets. A package in the R language aids use in practice. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. crossnma: An R package to synthesize cross-design evidence and cross-format data using network meta-analysis and network meta-regression.

Author: Hamza, Tasnim, Schwarzer, Guido, and Salanti, Georgia
Subjects: GIBBS sampling, CLINICAL trials, SCIENTIFIC observation, WORKFLOW, COHORT analysis
Abstract: Background: Although aggregate data (AD) from randomised clinical trials (RCTs) are used in the majority of network meta-analyses (NMAs), other study designs (e.g., cohort studies and other non-randomised studies, NRS) can be informative about relative treatment effects. The individual participant data (IPD) of the study, when available, are preferred to AD for adjusting for important participant characteristics and to better handle heterogeneity and inconsistency in the network. Results: We developed the R package crossnma to perform cross-format (IPD and AD) and cross-design (RCT and NRS) NMA and network meta-regression (NMR). The models are implemented as Bayesian three-level hierarchical models using Just Another Gibbs Sampler (JAGS) software within the R environment. The R package crossnma includes functions to automatically create the JAGS model, reformat the data (based on user input), assess convergence and summarize the results. We demonstrate the workflow within crossnma by using a network of six trials comparing four treatments. Conclusions: The R package crossnma enables the user to perform NMA and NMR with different data types in a Bayesian framework and facilitates the inclusion of all types of evidence recognising differences in risk of bias. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

19. symptomcheckR: an R package for analyzing and visualizing symptom checker triage performance.

Author: Kopka, Marvin and Feufel, Markus A.
Subjects: OPEN source software, STANDARDIZATION, PATIENT safety, RESOURCE allocation, ACCURACY
Abstract: Background: A major stream of research on symptom checkers aims at evaluating the technology's predictive accuracy, but apart from general trends, the results are marked by high variability. Several authors suggest that this variability might in part be due to different assessment methods and a lack of standardization. To improve the reliability of symptom checker evaluation studies, several approaches have been suggested, including standardizing input procedures, the generation of test vignettes, and the assignment of gold standard solutions for these vignettes. Recently, we suggested a third approach, test-theoretic metrics for standardized performance reporting, to allow systematic and comprehensive comparisons of symptom checker performance. However, calculating these metrics is time-consuming and error-prone, which could hamper the use and effectiveness of these metrics. Results: We developed the R package symptomcheckR as open-source software to assist researchers in calculating standard metrics to evaluate symptom checker performance individually and comparatively and produce publication-ready figures. These metrics include accuracy (by triage level), safety of advice (i.e., rate of correctly or overtriaged cases), comprehensiveness (i.e., how many cases could be entered or were assessed), inclination to overtriage (i.e., how risk-averse a symptom checker is) and a capability comparison score (i.e., a score correcting for case difficulty and comprehensiveness that enables a fair and reliable comparison of different symptom checkers). Each metric can be obtained using a single command and visualized with another command. For the analysis of individual or the comparison of multiple symptom checkers, single commands can be used to produce a comprehensive performance profile that complements the standard focus on accuracy with additional metrics that reveal strengths and weaknesses of symptom checkers. Conclusions: Our package supports ongoing efforts to improve the quality of vignette-based symptom checker evaluation studies by means of standardized methods. Specifically, with our package, adhering to reporting standards and metrics becomes easier, simpler, and more time-efficient. Ultimately, this may help users gain a more systematic understanding of the strengths and limitations of symptom checkers for different use cases (e.g., all-purpose symptom checkers for general medicine versus symptom checkers that aim at improving triage in emergency departments), which can improve patient safety and resource allocation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. SurvdigitizeR: an algorithm for automated survival curve digitization.

Author: Zhang, Jasper Zhongyuan, Rios, Juan David, Pechlivanoglou, Tilemanchos, Yang, Alan, Zhang, Qiyue, Deris, Dimitrios, Cromwell, Ian, and Pechlivanoglou, Petros
Subjects: OPTICAL character recognition, STANDARD deviations, DISTRIBUTION (Probability theory), R-curves, DIGITIZATION
Abstract: Background: Decision analytic models and meta-analyses often rely on survival probabilities that are digitized from published Kaplan–Meier (KM) curves. However, manually extracting these probabilities from KM curves is time-consuming, expensive, and error-prone. We developed an efficient and accurate algorithm that automates extraction of survival probabilities from KM curves. Methods: The automated digitization algorithm processes images in JPG or PNG format, converts them to the hue, saturation, and lightness scale, and uses optical character recognition to detect axis location and labels. It also uses a k-medoids clustering algorithm to separate multiple overlapping curves on the same figure. To validate performance, we generated survival plots from random time-to-event data with sample sizes of 25, 50, 150, 250, and 1000 individuals split into 1, 2, or 3 treatment arms. We assumed an exponential distribution and applied random censoring. We compared automated digitization and manual digitization performed by well-trained researchers. We calculated the root mean squared error (RMSE) at 100 time points for both methods. The algorithm's performance was also evaluated by Bland–Altman analysis for the agreement between automated and manual digitization on a real-world set of published KM curves. Results: The automated digitizer accurately identified survival probabilities over time in the simulated KM curves. The average RMSE for automated digitization was 0.012, while manual digitization had an average RMSE of 0.014. Its performance was negatively correlated with the number of curves in a figure and the presence of censoring markers. In real-world scenarios, automated digitization and manual digitization showed very close agreement. Conclusions: The algorithm streamlines the digitization process and requires minimal user input. It effectively digitized KM curves in simulated and real-world scenarios, demonstrating accuracy comparable to conventional manual digitization. The algorithm has been developed as an open-source R package and as a Shiny application and is available on GitHub: https://github.com/Pechli-Lab/SurvdigitizeR and https://pechlilab.shinyapps.io/SurvdigitizeR/. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. biblioverlap: an R package for document matching across bibliographic datasets.

Author: Vieira, Gabriel Alves and Leta, Jacqueline
Abstract: Bibliographic databases have long been a cornerstone of scientometrics research, and new information sources have prompted several comparative studies between them. Such studies often employ document-level matching procedures to identify overlaps in the corpus of each database and assess their coverage. However, despite being increasingly relevant in comparative studies, such a type of analysis still lacks an open-source tool to automate it. To fill this gap, we have developed an R package called biblioverlap, which implements a hybrid matching approach using a unique identifier and a selection of ubiquitous bibliographic fields to establish document co-occurrence. It supports data analysis from a broad range of secondary sources and can be used for comparing databases and assessing document overlap in virtually any bibliographic dataset, which can be insightful for various research questions. This paper presents the biblioverlap tool, details the matching procedure's implementation, and uses an example dataset containing records from the Federal University of Rio de Janeiro to illustrate the package's built-in functionality. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. GasanalyzeR: advancing reproducible research using a new R package for photosynthesis data workflows.

Author: Tholen, Danny
Subjects: CHLOROPHYLL spectra, CARBON isotopes, STABLE isotopes, REPRODUCIBLE research, RESEARCH personnel
Abstract: The analysis of photosynthetic traits has become an integral part of plant (eco-)physiology. Many of these characteristics are not directly measured, but calculated from combinations of several, more direct, measurements. The calculations of such derived variables are based on underlying physical models and may use additional constants or assumed values. Commercially available gas-exchange instruments typically report such derived variables, but the available implementations use different definitions and assumptions. Moreover, no software is currently available to allow a fully scripted and reproducible workflow that includes importing data, pre-processing and recalculating derived quantities. The R package gasanalyzer aims to address these issues by providing methods to import data from different instruments, by translating photosynthetic variables to a standardized nomenclature, and by optionally recalculating derived quantities using standardized equations. In addition, the package facilitates performing sensitivity analyses on variables or assumptions used in the calculations to allow researchers to better assess the robustness of the results. The use of the package and how to perform sensitivity analyses are demonstrated using three different examples. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. Power Analysis of Exposure Mixture Studies Via Monte Carlo Simulations.

Author: Nguyen, Phuc H., Herring, Amy H., and Engel, Stephanie M.
Abstract: Estimating sample size and statistical power is an essential part of a good epidemiological study design. Closed-form formulas exist for simple hypothesis tests but not for advanced statistical methods designed for exposure mixture studies. Estimating power with Monte Carlo simulations is flexible and applicable to these methods. However, coding a simulation is not straightforward for inexperienced programmers, and it is often hard for a researcher to manually specify multivariate associations among exposure mixtures to set up a simulation. To simplify this process, we present the R package mpower for power analysis of observational studies of environmental exposure mixtures involving recently developed mixtures analysis methods. The components within mpower are also versatile enough to accommodate any mixtures methods that will be developed in the future. The package allows users to simulate realistic exposure data and mixed-type covariates based on public datasets such as the National Health and Nutrition Examination Survey or other existing datasets from prior studies. Users can generate power curves to assess the trade-offs between sample size, effect size, and power of a design. This paper presents tutorials and examples of power analysis using mpower. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

24. A Strategy to Inform Athlete Sleep Support From Questionnaire Data and Its Application in an Elite Athlete Cohort.

Author: Suppiah, Haresh T., Gastin, Paul B., and Driller, Matthew W.
Subjects: SOCIAL support, CROSS-sectional method, SLEEP hygiene, MACHINE learning, PSYCHOSOCIAL factors, QUESTIONNAIRES, DESCRIPTIVE statistics, CLUSTER analysis (Statistics), DATA analysis software, ELITE athletes
Abstract: Purpose: Information from the Pittsburgh Sleep Quality Index (PSQI) and Athlete Sleep Behavior Questionnaire (ASBQ) provides the ability to identify the sleep disturbances experienced by athletes and the athlete-specific challenges that cause these disturbances. However, determining the appropriate support strategy to optimize the sleep habits and characteristics of large groups of athletes can be time-consuming and resource-intensive. The purpose of this study was to characterize the sleep profiles of elite athletes to optimize sleep-support strategies and present a novel R package, AthSlpBehaviouR, to aid practitioners with athlete sleep monitoring and support efforts. Methods: PSQI and ASBQ data were collected from a cohort of 412 elite athletes across 27 sports through an electronic survey. A k-means cluster analysis was employed to characterize the unique sleep-characteristic typologies based on PSQI and ASBQ component scores. Results: Three unique clusters were identified and qualitatively labeled based on the z scores of the PSQI components and ASBQ components: cluster 1, "high-priority; poor overall sleep characteristics + behavioral-focused support"; cluster 2, "medium-priority, sleep disturbances + routine/environment-focused support"; and cluster 3, "low-priority; acceptable sleep characteristics + general support." Conclusions: The findings of this study highlight the practical utility of an unsupervised learning approach to perform clustering on questionnaire data to inform athlete sleep-support recommendations. Practitioners can consider using the AthSlpBehaviouR package to adopt a similar approach in athlete sleep screening and support provision. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

25. PCAS: An Integrated Tool for Multi-Dimensional Cancer Research Utilizing Clinical Proteomic Tumor Analysis Consortium Data.

Author: Wang, Jin, Song, Xiangrong, Wei, Meidan, Qin, Lexin, Zhu, Qingyun, Wang, Shujie, Liang, Tingting, Hu, Wentao, Zhu, Xinyu, and Li, Jianxiang
Subjects: PROTEOMICS, CANCER research, MEDICAL research, DATA analysis, TRANSCRIPTOMES
Abstract: Proteomics offers a robust method for quantifying proteins and elucidating their roles in cellular functions, surpassing the insights provided by transcriptomics. The Clinical Proteomic Tumor Analysis Consortium database, enriched with comprehensive cancer proteomics data including phosphorylation and ubiquitination profiles, alongside transcriptomics data from the Genomic Data Commons, allows for integrative molecular studies of cancer. The ProteoCancer Analysis Suite (PCAS), our newly developed R package and Shiny app, leverages these resources to facilitate in-depth analyses of proteomics, phosphoproteomics, and transcriptomics, enhancing our understanding of the tumor microenvironment through features like immune infiltration and drug sensitivity analysis. This tool aids in identifying critical signaling pathways and therapeutic targets, particularly through its detailed phosphoproteomic analysis. To demonstrate the functionality of the PCAS, we conducted an analysis of GAPDH across multiple cancer types, revealing a significant upregulation of protein levels, which is consistent with its important biological and clinical significance in tumors, as indicated in our prior research. Further experiments were performed to validate the findings obtained using the tool. In conclusion, the PCAS is a powerful and valuable tool for conducting comprehensive proteomic analyses, significantly enhancing our ability to uncover oncogenic mechanisms and identify potential therapeutic targets in cancer research. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Global trends and hotspots in pain associated with bipolar disorder in the last 20 years: a bibliometric analysis.

Author: Hong Qing Zhao, Mi Zhou, Jia Qi Jiang, Zhi Qiang Luo, and Yu Hong Wang
Subjects: BIBLIOMETRICS, BIPOLAR disorder, NEURALGIA, CHRONIC pain, DATABASES
Abstract: Purpose: The prevalence of comorbid pain and bipolar disorder in clinical practice continues to be high, with an increasing number of related publications. However, no study has used bibliometric methods to analyze the research progress and knowledge structure in this field. Our research is dedicated to systematically exploring the global trends and focal points in scientific research on pain comorbidity with bipolar disorder from 2003 to 2023, with the goal of contributing to the field. Methods: Relevant publications in this field were retrieved from the Web of Science Core Collection database (WoSCC). We used VOSviewer, CiteSpace, and the R package "Bibliometrix" for bibliometric analysis. Results: A total of 485 publications (including 360 articles and 125 reviews) from 66 countries and 1019 institutions were included in this study. Univ Toronto and Kings Coll London are the leading research institutions in this field. J Affect Disorders contributed the largest number of articles and is the most co-cited journal. Of the 2,537 scholars who participated in the study, Stubbs B, Vancampfort D, and Abdin E had the largest number of articles. Stubbs B is the most co-cited author. "chronic pain," "neuropathic pain," and "psychological pain" are the keywords in the research. Conclusion: This is the first bibliometric analysis of pain-related bipolar disorder. There is growing interest in the area of pain and comorbid bipolar disorder. Focusing on different types of pain in bipolar disorder and emphasizing pain management in bipolar disorder are research hotspots and future trends. The study of pain-related bipolar disorder still has significant potential for development, and we look forward to more high-quality research in the future. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. MSProfileR: An Open-Source Software for Quality Control of Matrix-Assisted Laser Desorption Ionization–Time of Flight Spectra.

Author: Ben Hamouda, Refka, Estellon, Bertrand, Himet, Khalil, Cherif, Aimen, Marthinet, Hugo, Loreau, Jean-Marie, Texier, Gaëtan, Granjeaud, Samuel, and Almeras, Lionel
Subjects: COMPUTER software quality control, TANDEM mass spectrometry, QUALITY control, DESORPTION, WEB browsers
Abstract: In the early 2000s, matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) emerged as a performant and relevant tool for identifying micro-organisms. Since then, it has become practically essential for identifying bacteria in microbiological diagnostic laboratories. In the last decade, it was successfully applied for arthropod identification, allowing researchers to distinguish vectors from non-vectors of infectious diseases. However, identification failures are not rare, hampering its wide use. Failure is generally attributed either to the absence of respective counter species MS spectra in the database or to the insufficient quality of query MS spectra (i.e., lower intensity and diversity of MS peaks detected). To avoid matching errors due to non-compliant spectra, the development of a strategy for detecting and excluding outlier MS profiles became compulsory. To this end, we created MSProfileR, an R package leading to a bioinformatics tool through a simple installation, integrating a quality control system for MS spectra and an analysis pipeline including peak detection and MS spectra comparisons. MSProfileR can also add metadata concerning the sample that the spectra are derived from. MSProfileR has been developed in the R environment and offers a user-friendly web interface using the R Shiny framework. It is available on Microsoft Windows as a web browser application by simple navigation using the link of the package on GitHub v.3.10.0. MSProfileR is therefore accessible to non-computer specialists and is freely available to the scientific community. We evaluated MSProfileR using two datasets including exclusively MS spectra from arthropods. In addition to coherent sample classification, outlier MS spectra were detected in each dataset, confirming the value of MSProfileR. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Mapping the research trends on political communication in Asia: A bibliometric analysis using R package and VOS.

Author: Saravanan, J, Thomas, Vineeth, and Ashikho, Aviini
Subjects: POLITICAL communication, BIBLIOMETRICS, POLITICAL attitudes, POLITICAL parties, PUBLIC officers, COUNTRIES
Abstract: Political communication refers to developing and exchanging political ideas and opinions among the general public, elected officials, political parties and affiliated organisations like the media. Recent years have seen an enormous amount of literature in the area of political communication owing to the growing interest of academics in the subject. Using the R package bibliometrix and the Visualisation of Similarities viewer programme, this study aims to enhance graphical mapping of the bibliographic data for political communication publications in select countries of Asia. The results show that, especially since 2016, scholars have been paying more and more attention to the study of political communication in the age of fake news, hyperpolarization, etc. They also show that research publications on the topics of communication, China, Taiwan, India, the USA, social media, articles, politics, the internet, decision-making, democracy, governance and elections are gaining momentum in recent years. Additionally, the findings show that the top three nations for publishing articles on political communication are the USA, China and Russia. The findings also reveal that even scholars from non-democratic or less democratic countries have made substantial attempts to improve political communication studies, despite the fact that political communication is one of the most crucial components in democratic countries. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. DEMRAT: AN R PACKAGE FOR PREDICTING GROWTH AND FERTILITY RATES IN SKELETAL SAMPLES USING AGE-AT-DEATH RATIOS.

Author: GALETA, PATRIK
Subjects: FERTILITY, POPULATION, DEATH rate, R (Computer program language), ALGORITHMS
Abstract: The growth and fertility rates of past populations can be estimated by analyzing the age-at-death distribution of skeletal samples. The procedure involves regressing growth or fertility rate on the age-at-death ratio, which is a proxy that captures the number of skeletons in two broad age-at-death categories (e.g., D5+/D20+). Galeta and Pankowská (2023, doi: 10.1371/journal.pone.0286580) recently developed a new prediction algorithm. They proposed to estimate growth and fertility rates using a unique prediction formula for each skeletal sample. Each formula is based on a unique reference set of simulated skeletal samples that match the size of the target real skeletal sample. The simulated skeletal samples are generated from populations with similar mortality levels to those assumed in the time period represented by the target skeletal sample. A correct setting of the sample size and the level of mortality increases the accuracy of the estimate. The approach, however, is computationally intensive because it involves generating many simulated reference skeletal samples. In this paper, we present the demrat package, written in the R programming language, which automates the simulation. The functions of the package provide a complete workflow from a real skeletal sample to the prediction of demographic rates. In addition, we offer a web application that allows non-R users to deploy predictions using the demrat package with a user-friendly, point-and-click graphical interface. Although the demrat package allows for estimating demographic rates for a single skeletal sample, we recommend predicting demographic rates in a larger set of skeletal samples and producing smoothed general demographic trends over large areas and/or long periods of time. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. CorrToolBox: an R package for modeling correlational magnitude transformations in discretization contexts.

Author: Gao, R. and Demirtas, H.
Subjects: FEASIBILITY studies
Abstract: This article describes the R package CorrToolBox, which is designed for modeling the correlation transitions under specified distributional assumptions within the realm of discretization in the context of the latency and threshold concepts. The practical utility and functionality of the package are demonstrated by several illustrative examples. In addition, the package's feasibility and performance are evaluated via simulation studies using synthetic mixed data with a range of marginal distributions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. Facilitating an ecosystem approach through open data and information packaging.

Author: Duplisea, Daniel E, Roux, Marie-Julie, Plourde, Stéphane, Galbraith, Peter S, Blais, Marjolaine, Benoît, Hugues P, Sainte-Marie, Bernard, Lavoie, Diane, and Bourdages, Hugo
Subjects: FISHERY policy, GOVERNMENT laboratories, PACKAGING, MARINE ecology, ECOSYSTEMS, FISHERIES
Abstract: Open data that can be easily incorporated into analyses are essential for developing ecosystem approaches to marine ecological management: a common goal in fisheries policy in many countries. Although it is not always clear what constitutes an ecosystem approach, it always involves scientists working with a large variety of data and information, including data from physical and oceanographic sampling, multispecies surveys, and other sources describing human pressures. This can be problematic for analysts because these data, even when available, are often held in disparate datasets that do not necessarily correspond at appropriate temporal and spatial scales. Data can often only be obtained by specific requests to individuals in governmental agencies who are delivering on an increasing number of data requests as interest grows in practical ecosystem approach implementation. This data access model is not sustainable and hinders the momentum for ecosystem approach development. We describe a data bundling R package that makes data and climate projections available at appropriate scales to facilitate development of an ecosystem approach for the Gulf of St. Lawrence, Canada. This approach integrates closely with the present workflow of most government analysts, academics in fisheries, and scientists in private industry. The approach conforms with open data initiatives and makes data easily available globally while relieving some of the burden of data provision that can fall to some individuals in government laboratories. The structure and approach are generic, adaptable, and transferable to other regions and jurisdictions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. amstar2Vis: An R package for presenting the critical appraisal of systematic reviews based on the items of AMSTAR 2.

Author: Bougioukas, Konstantinos I., Karakasis, Paschalis, Pamporis, Konstantinos, Bouras, Emmanouil, and Haidich, Anna-Bettina
Subjects: COMPUTER software development
Abstract: Systematic reviews (SRs) have an important role in healthcare decision-making practice. Assessing the overall confidence in the results of SRs using quality assessment tools, such as “A MeaSurement Tool to Assess Systematic Reviews 2” (AMSTAR 2), is crucial since not all SRs are conducted using the most rigorous methods. In this article, we introduce a free, open-source R package called “amstar2Vis” (https://github.com/bougioukas/amstar2Vis) that provides easy-to-use functions for presenting the critical appraisal of SRs, based on the items of the AMSTAR 2 checklist. An illustrative example is outlined, describing the steps involved in creating a detailed table with the item ratings and the overall confidence ratings, generating a stacked bar plot that shows the distribution of ratings as percentages of SRs for each AMSTAR 2 item, and creating a “ggplot2” graph that shows the distribution of overall confidence ratings (“Critically Low,” “Low,” “Moderate,” or “High”). We expect “amstar2Vis” to be useful for overview authors and methodologists who assess the quality of SRs with the AMSTAR 2 checklist and to facilitate the production of pertinent publication-ready tables and figures. Future research and applications could further investigate the functionality or potential improvements of our package. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. tosr: Create the Tree of Science from WoS and Scopus.

Author: Robledo, Sebastian, Valencia, Luis, Zuluaga, Martha, Echeverri, Oscar Arbelaez, and Valencia, Jorge W. Arboleda
Subjects: SCIENTOMETRICS, CITATION analysis, DATABASES, METHODOLOGY
Abstract: The R package 'tosr' enables the construction of the Tree of Science (ToS), a metaphorical representation of scientific papers on a specific topic. The ToS's roots symbolize seminal works, the trunk stands for structural works, and the leaves depict the current literature. Traditionally, researchers have had to limit their ToS to data from a single database, such as Scopus or Web of Science (WoS). The 'tosr' package overcomes this limitation by allowing researchers to merge seed files from both Scopus and WoS, thereby facilitating a more comprehensive bibliometric analysis. This paper describes the development and application of the 'tosr' package, demonstrating its unique capabilities in creating a more complete and more cohesive ToS and citation network for any scientific topic. By bridging the gap between these two major databases, 'tosr' offers researchers an unprecedented tool for scientometric research. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. 'rtry': An R package to support plant trait data preprocessing.

Author: Lam, Olee Hoi Ying, Kattge, Jens, Tautenhahn, Susanne, Boenisch, Gerhard, Kovach, Kyle R., and Townsend, Philip A.
Subjects: INFORMATION retrieval, DATABASES, FLEXIBLE structures, DATA release, FACTORY design & construction, DATA structures
Abstract: Plant trait data are used to quantify how plants respond to environmental factors and can act as indicators of ecosystem function. Measured trait values are influenced by genetics, trade‐offs, competition, environmental conditions, and phenology. These interacting effects on traits are poorly characterized across taxa, and for many traits, measurement protocols are not standardized. As a result, ancillary information about growth and measurement conditions can be highly variable, requiring a flexible data structure. In 2007, the TRY initiative was founded as an integrated database of plant trait data, including ancillary attributes relevant to understanding and interpreting the trait values. The TRY database now integrates around 700 original and collective datasets and has become a central resource of plant trait data. These data are provided in a generic long‐table format, where a unique identifier links different trait records and ancillary data measured on the same entity. Due to the high number of trait records, plant taxa, and types of traits and ancillary data released from the TRY database, data preprocessing is necessary but not straightforward. Here, we present the 'rtry' R package, specifically designed to support plant trait data exploration and filtering. By integrating a subset of existing R functions essential for preprocessing, 'rtry' avoids the need for users to navigate the extensive R ecosystem and provides the functions under a consistent syntax. 'rtry' is therefore easy to use even for beginners in R. Notably, 'rtry' does not support data retrieval or analysis; rather, it focuses on the preprocessing tasks to optimize data quality. While 'rtry' primarily targets TRY data, its utility extends to data from other sources, such as the National Ecological Observatory Network (NEON). The 'rtry' package is available on the Comprehensive R Archive Network (CRAN; https://cran.r-project.org/package=rtry) and the GitHub Wiki (https://github.com/MPI-BGC-Functional-Biogeography/rtry/wiki) along with comprehensive documentation and vignettes describing detailed data preprocessing workflows. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. hespdiv: an R package for spatially constrained, hierarchical and contiguous regionalization in palaeobiogeography.

Author: Daumantas, Liudas and Spiridonov, Andrej
Subjects: MIOCENE Epoch, BIOTIC communities
Abstract: The objective determination of boundaries of bioregions is a nontrivial problem. Here we present a new method family, HespDiv, and an algorithm for its implementation in a new R package: hespdiv. The hespdiv algorithm performs iterative hierarchical nonlinear spatial subdivisions of taxic data into topologically contiguous geographic regions. Bioregions obtained in this way can be viewed as realistic and causally important entities belonging to the eco‐genealogical Bretskyan hierarchy. The possibilities of the hespdiv algorithm are successfully demonstrated on the Miocene mammal data from the contiguous United States, where a hierarchical set of bioregions was determined. The algorithm works in a decision‐tree‐like manner by generating multiple split‐lines that each subdivide a study area and data into two parts per iteration per region. The performance of each split‐line is measured using a combination of data generalization and comparison functions that can be custom made or selected from pre‐set subdivision methods. In each iteration, the best split‐line is used to produce a subdivision, until no more adequate split‐lines are found. The process results in a hierarchy of subdivisions of the distribution of taxonomic occurrences in space. Benefits of hespdiv include the delineation of spatially contiguous clusters, flexibility, and unique representations of the spatial hierarchy tree. The package is shown to be effective in determining significant contiguous spatial structures of biota, so‐called geobiomes, in sufficiently sampled time intervals and regions. The flexibility and ease of use of the approach allow its application to the whole range of palaeo‐variables and environmental proxies. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. eDITH: An R‐package to spatially project eDNA‐based biodiversity across river networks with minimal prior information.

Author: Carraro, Luca and Altermatt, Florian
Subjects: ENVIRONMENTAL monitoring, ECOLOGICAL assessment, BIODIVERSITY, SPECIES distribution, BIOLOGISTS, FRESHWATER biodiversity
Abstract: Ecological and ecosystem monitoring is rapidly shifting towards using environmental DNA (eDNA) data, particularly in aquatic systems. This approach enables a combined coverage of biodiversity across all major organismal groups and the assessment of ecological indices. Yet, most current approaches are not exploiting the full potential of eDNA data, largely interpreting results in a localized perspective. In riverine networks, by explicitly modelling hydrological transport and associated DNA decay, hydrology‐based models enable upscaling eDNA‐based diversity information, providing spatially integrated inference. To capitalize on these unprecedented biodiversity data and translate it into space‐filling biodiversity projections, a streamlined implementation is needed. Here, we introduce the eDITH R‐package, implementing the eDITH model to project biodiversity across riverine networks with minimal prior information. eDITH couples a species distribution model relating a local taxon's eDNA shedding rate in streamwater to environmental covariates, a mass balance expressing the eDNA concentration at a river's cross‐section as a weighted sum of upstream contributions, and an observational model accounting for uncertainties in eDNA measurements. By leveraging on spatially replicated eDNA measurements and minimal hydro‐morphological data, eDITH enables disentangling the various upstream eDNA sources, and produces space‐filling maps of a taxon's spatial distribution at any chosen resolution. eDITH is applicable to both eDNA concentration and metabarcoding data, and to any taxon whose DNA can be retrieved in streamwater. The eDITH package provides user‐friendly functions for single‐run execution and fitting of eDITH to eDNA data with both Bayesian methods (via the BayesianTools package) and non‐linear optimization. An interface to the DHARMa package allows model validation via posterior predictive checks.
Necessary preliminary steps such as watershed delineation and hydrological characterization are implemented via the rivnet package. We illustrate eDITH's workflow and functionalities with two case studies from published fish eDNA data. The eDITH package provides a user‐friendly implementation of eDITH, specifically intended for ecologists and conservation biologists. It can be used without previous modelling knowledge but also allows customization for experienced users. Ultimately, eDITH allows upscaling eDNA biodiversity data for any river globally, transforming how state and change in biodiversity in riverine systems can be tracked at high resolution in a highly versatile manner. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Inference of genomic landscapes using ordered Hidden Markov Models with emission densities (oHMMed).

Author: Vogl, Claus, Karapetiants, Mariia, Yıldırım, Burçin, Kjartansdóttir, Hrönn, Kosiol, Carolin, Bergman, Juraj, Majka, Michal, and Mikula, Lynette Caitlin
Subjects: HIDDEN Markov models, HUMAN chromosomes, CONTINUOUS distributions, MARKOV chain Monte Carlo, GENE expression, ZOOLOGICAL nomenclature
Abstract: Background: Genomes are inherently inhomogeneous, with features such as base composition, recombination, gene density, and gene expression varying along chromosomes. Evolutionary, biological, and biomedical analyses aim to quantify this variation, account for it during inference procedures, and ultimately determine the causal processes behind it. Since sequential observations along chromosomes are not independent, it is unsurprising that autocorrelation patterns have been observed, e.g., in human base composition. In this article, we develop a class of Hidden Markov Models (HMMs) called oHMMed (ordered HMM with emission densities; the corresponding R package of the same name is available on CRAN). They identify the number of comparably homogeneous regions within autocorrelated observed sequences. These are modelled as discrete hidden states; the observed data points are realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting only neighbouring states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are inferred. Results: We apply our oHMMed algorithms to the proportion of G and C bases (modelled as a mixture of normal distributions) and the number of genes (modelled as a mixture of Poisson-gamma distributions) in windows along the human, mouse, and fruit fly genomes. This results in a partitioning of the genomes into regions by statistically distinguishable averages of these features, and in a characterisation of their continuous patterns of variation. In regard to the genomic G and C proportion, this latter result distinguishes oHMMed from segmentation algorithms based on isochore or compositional domain theory.
We further use oHMMed to conduct a detailed analysis of variation of chromatin accessibility (ATAC-seq) and epigenetic markers H3K27ac and H3K27me3 (modelled as a mixture of Poisson-gamma distributions) along the human chromosome 1 and their correlations. Conclusions: Our algorithms provide a biologically assumption-free approach to characterising genomic landscapes shaped by continuous, autocorrelated patterns of variation. Despite this, the resulting genome segmentation enables extraction of compositionally distinct regions for further downstream analyses. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. Statistical rules for safety monitoring in clinical trials.

Author: Martens, Michael J and Logan, Brent R
Subjects: SICKLE cell anemia treatment, STATISTICAL models, PATIENT safety, DATA analysis, RESEARCH funding, CLINICAL trials, STATISTICS, CONCEPTUAL structures, DATA analysis software, BONE marrow transplantation, ADVERSE health care events, CHILDREN
Abstract: Background/Aims: Protecting patient safety is an essential component of the conduct of clinical trials. Rigorous safety monitoring schemes are implemented for these studies to guard against excess toxicity risk from study therapies. They often include protocol-specified stopping rules dictating that an excessive number of safety events will trigger a halt of the study. Statistical methods are useful for constructing rules that protect patients from exposure to excessive toxicity while also maintaining the chance of a false safety signal at a low level. Several statistical techniques have been proposed for this purpose, but the current literature lacks a rigorous comparison to determine which method may be best suitable for a given trial design. The aims of this article are (1) to describe a general framework for repeated monitoring of safety events in clinical trials; (2) to survey common statistical techniques for creating safety stopping criteria; and (3) to provide investigators with a software tool for constructing and assessing these stopping rules. Methods: The properties and operating characteristics of stopping rules produced by Pocock and O'Brien-Fleming tests, Bayesian Beta-Binomial models, and sequential probability ratio tests (SPRTs) are studied and compared for common scenarios that may arise in phase II and III trials. We developed the R package "stoppingrule" for constructing and evaluating stopping rules from these methods. Its usage is demonstrated through a redesign of a stopping rule for BMT CTN 0601 (registered at Clinicaltrials.gov as NCT00745420), a phase II, single-arm clinical trial that evaluated outcomes in pediatric sickle cell disease patients treated by bone marrow transplant. Results: Methods with aggressive stopping criteria early in the trial, such as the Pocock test and Bayesian Beta-Binomial models with weak priors, have permissive stopping criteria at late stages. 
This results in a trade-off where rules with aggressive early monitoring generally will have a smaller number of expected toxicities but also lower power than rules with more conservative early stopping, such as the O'Brien-Fleming test and Beta-Binomial models with strong priors. The modified SPRT method is sensitive to the choice of alternative toxicity rate. The maximized SPRT generally has a higher number of expected toxicities and/or worse power than other methods. Conclusions: Because the goal is to minimize the number of patients exposed to and experiencing toxicities from an unsafe therapy, we recommend using the Pocock or Beta-Binomial, weak prior methods for constructing safety stopping rules. At the design stage, the operating characteristics of candidate rules should be evaluated under various possible toxicity rates in order to guide the choice of rule(s) for a given trial; our R package facilitates this evaluation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. VedicDateTime: An R package to implement Vedic calendar system.

Author: Bokde, Neeraj Dhanraj, Patil, Prajwal Kailasnath, Sengupta, Saradindu, Sawant, Manisha, and Feijóo-Lorenzo, Andrés E.
Abstract: Calendar systems adopted across the world are either solar, lunar or lunisolar, based on the movements of the sun, the moon, or both. The Gregorian (solar) calendars are considered as time references for modern computations. However, Vedic calendars, being lunisolar, can be more effective for the analysis of activities that depend on both celestial bodies. In this paper, we present VedicDateTime, an open-source framework that implements the Vedic calendar and provides conversions for Gregorian dates. Along with package details, we also provide two case studies that make use of the proposed package for time-series analysis. The objective of this paper is to motivate researchers to explore the potential of the Vedic calendar from the perspective of time series analysis. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. SWAT+ input data preparation in a scripted workflow: SWATprepR.

Author: Plunge, Svajunas, Szabó, Brigitta, Strauch, Michael, Čerkasova, Natalja, Schürz, Christoph, and Piniewski, Mikołaj
Subjects: QUALITY control, ATMOSPHERIC deposition, CROP rotation, QUALITY assurance, ELECTRONIC data processing, WORKFLOW
Abstract: Input data collection, quality assurance and preparation are central but time-consuming steps in environmental modeling. Errors due to manual processing of model input data can result in an incorrect representation of an environmental system and may consequently lead to implausible model simulations. Correct input data preparation and thorough quality check at an early stage of the model setup procedure are essential to build confidence in model simulation results. Typically, in environmental model applications, many steps in the input data preparation phase have to be repeated with the inflow of new, additional or corrected data. In this study, we selected the widely used SWAT+ ecohydrological model as an illustrative example to investigate challenges related to input data preparation. To assist in these tasks, we developed an R package named SWATprepR, which provides functions for typical and repeating SWAT+ model input data preparation tasks. The package supports the preparation of weather input files, atmospheric deposition, soil parameters, crop rotations, and observed (control or calibration) data, to name a few, presently with focus on European applications. The SWATprepR functions are integrated in R script workflows and can help SWAT+ modelers to avoid repetitive tasks, secure reproducibility and transparently document the data processing steps. Application of the package is illustrated with a test case of a SWAT+ model for a small catchment in central Poland. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. metamedian: An R package for meta‐analyzing studies reporting medians.

Author: McGrath, Sean, Zhao, XiaoFei, Ozturk, Omer, Katzenschlager, Stephan, Steele, Russell, and Benedetti, Andrea
Subjects: MEDIAN (Mathematics), RESEARCH personnel
Abstract: When performing an aggregate data meta‐analysis of a continuous outcome, researchers often come across primary studies that report the sample median of the outcome. However, standard meta‐analytic methods typically cannot be directly applied in this setting. In recent years, there has been substantial development in statistical methods to incorporate primary studies reporting sample medians in meta‐analysis, yet there are currently no comprehensive software tools implementing these methods. In this paper, we present the metamedian R package, a freely available and open‐source software tool for meta‐analyzing primary studies that report sample medians. We summarize the main features of the software and illustrate its application through real data examples involving risk factors for a severe course of COVID‐19. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. A conceptual framework for host‐associated microbiomes of hybrid organisms.

Author: Camper, Benjamin T., Laughlin, Zachary, Malagon, Daniel, Denton, Robert, and Bewick, Sharon
Subjects: DATA visualization, STRUCTURAL frames, CONCEPTUAL models, HYBRID systems, GUT microbiome, DATA analysis
Abstract: Hybridization between organisms from evolutionarily distinct lineages can have profound consequences on organismal ecology, with cascading effects on fitness and evolution. Most studies of hybrid organisms have focused on organismal traits, for example, various aspects of morphology and physiology. However, with the recent emergence of holobiont theory, there has been growing interest in understanding how hybridization impacts and is impacted by host‐associated microbiomes. Better understanding of the interplay between host hybridization and host‐associated microbiomes has the potential to provide insight into both the roles of host‐associated microbiomes as dictators of host performance as well as the fundamental rules governing host‐associated microbiome assembly. Unfortunately, there is a current lack of frameworks for understanding the structure of host‐associated microbiomes of hybrid organisms. In this paper, we develop four conceptual models describing possible relationships between the host‐associated microbiomes of hybrids and their progenitor or 'parent' taxa. We then integrate these models into a quantitative '4H index' and present a new R package for calculation, visualization and analysis of this index. We demonstrate how the 4H index can be used to compare hybrid microbiomes across disparate plant and animal systems. Our analyses of these data sets show variation in the 4H index across systems based on host taxonomy, host site and microbial taxonomic group. Our four conceptual models, paired with our 4H index and associated visualization tools, facilitate comparison across hybrid systems. This, in turn, allows for systematic exploration of how different aspects of host hybridization impact the host‐associated microbiomes of hybrid organisms. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. epidecodeR: a functional exploration tool for epigenetic and epitranscriptomic regulation.

Author: Joshi, Kandarp and Wang, Dan O
Subjects: LIFE sciences, CUMULATIVE distribution function, RNA modification & restriction, GENE expression, ANIMAL welfare
Abstract: Recent technological advances in sequencing DNA and RNA modifications using high-throughput platforms have generated vast epigenomic and epitranscriptomic datasets whose power in transforming life science is yet to be fully unleashed. Currently available in silico methods have facilitated the identification, positioning and quantitative comparisons of individual modification sites. However, the essential challenge to link specific 'epi-marks' to gene expression in the particular context of cellular and biological processes is unmet. To fast-track exploration, we generated epidecodeR implemented in R, which allows biologists to quickly survey whether an epigenomic or epitranscriptomic status of their interest potentially influences gene expression responses. The evaluation is based on the cumulative distribution function and the statistical significance in differential expression of genes grouped by the number of 'epi-marks'. This tool proves useful in predicting the role of H3K9ac and H3K27ac in associated gene expression after knocking down deacetylases FAM60A and SDS3 and N6-methyl-adenosine-associated gene expression after knocking out the reader proteins. We further used epidecodeR to explore the effectiveness of demethylase FTO inhibitors and histone-associated modifications in drug abuse in animals. epidecodeR is available for downloading as an R package at https://bioconductor.riken.jp/packages/3.13/bioc/html/epidecodeR.html. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. A review of knowledge management research in the past three decades: a bibliometric analysis.

Author: Farooq, Rayees
Subjects: BIBLIOMETRICS, KNOWLEDGE management, THEMATIC maps, INFORMATION resources management, TECHNOLOGY management
Abstract: Purpose: This study aims to conduct a bibliometric analysis on knowledge management from journals in the Scopus database between 1988 and 2021. The paper covered the past three decades of publications and carried out performance analysis and science mapping analysis of articles. Design/methodology/approach: The study uses bibliometrics, performance analysis and science mapping analysis of 1,016 articles extracted from the Scopus database. The study examined the scientific productivity of articles, productive authors, citable documents, most relevant institutions, cited countries, co-occurrence of keywords, thematic mapping, co-citations and collaboration of authors and countries. The study used Biblioshiny as a tool to carry out the performance analysis and science mapping analysis. Findings: The results show that the number of publications has significantly increased in the past decade, 88.4% of authors contribute at least a single article, 8.3% of authors published two articles, 2% of the authors published three documents and 0.6% of the authors contribute four papers. The USA, China and Australia were the most productive countries in terms of the total number of citations and foreign collaborations. Journal of Knowledge Management, Knowledge Management Research and Practice, VINE Journal of Information and Knowledge Management and International Journal of Technology Management are the top outlets in the knowledge management literature. Originality/value: Over the past decade, the research on knowledge management construct has exploded because of the growing interest of researchers and practitioners in the field. Despite being a well-developed field, few studies have applied bibliometric analysis in the knowledge management literature. The study is more comprehensive in terms of the actors and methods involved in analyzing the scientific production of articles in the area of knowledge management. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. LCMS: An R package for automated semitargeted analysis in lipidomics.

Author: Peltier, Caroline, Vasku, Glenda, Crépin, Marine, Cabaret, Stephanie, and Berdeaux, Olivier
Subjects: LIPIDOMICS, LIPID analysis, RF values (Chromatography)
Abstract: While nontargeted analysis aims to profile and report the relative distributions of a wide range of molecules from different lipid classes/subclasses, its major challenge is the annotation and identification of the molecules. Semitargeted analysis circumvents the problem by establishing a (potentially large) list of molecules to be targeted in the samples that are identified before the analysis. This approach is particularly adapted for lipid analysis to help with the automation of lipid annotation and identification. However, the manual extraction of peaks for many molecules and many samples is time consuming. Consequently, an automation of these extractions is deeply required. This paper presents a free R package for the automation of semitargeted analysis for lipid analysis. From raw files collected with LC‐MS device and a list of molecules to target (containing their class), it automatically returns Excel files containing the intensities for each targeted molecule and each sample. This package allows a fast computation of the intensities. Furthermore, it guarantees the reproducibility of the results and is freely available and user‐friendly. Practical Applications: With the help of the R package presented in this paper, the use of semitargeted lipidomics as an alternative to untargeted analysis should be investigated by more labs. Work on the comparisons between the approaches could be conducted. While untargeted methods are mostly used, they require long pretreatments and identification of molecules of interest. On the contrary, in semitargeted analysis, once the integration table and retention time are obtained, the results are fast and directly interpretable. An idea for lipidomics would be to use untargeted lipidomics to compute the integration table and retention table, then use semitargeted analysis for a fast computation of well identified molecules. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

46. materialmodifier: An R package of photo editing effects for material perception research.

Author: Tsuda, Hiroyuki and Kawabata, Hideaki
Subjects: PHOTOGRAPHIC editing, BEHAVIORAL assessment, SOURCE code, PSYCHOLOGICAL research, SURFACE properties
Abstract: In this paper, we introduce an R package that performs automated photo editing effects. Specifically, it is an R implementation of an image-processing algorithm proposed by Boyadzhiev et al. (2015). The software allows the user to manipulate the appearance of objects in photographs, such as emphasizing facial blemishes and wrinkles, smoothing the skin, or enhancing the gloss of fruit. It provides a reproducible method to quantitatively control specific surface properties of objects (e.g., gloss and roughness), which is useful for researchers interested in topics related to material perception, from basic mechanisms of perception to the aesthetic evaluation of faces and objects. We describe the functionality, usage, and algorithm of the method, report on the findings of a behavioral evaluation experiment, and discuss its usefulness and limitations for psychological research. The package can be installed via CRAN, and documentation and source code are available at https://github.com/tsuda16k/materialmodifier. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

47. kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R.

Author: Aslett, Louis J. M. and Christ, Ryan R.
Subjects: HIDDEN Markov models, GENEALOGY, GENOME-wide association studies, PROGRAMMING languages
Abstract: Background: Approximating the recent phylogeny of N phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. The Li & Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an N × N distance matrix based on posterior decodings. Results: We provide a high-performance engine to make these posterior decodings readily accessible with minimal pre-processing via an easy to use package kalis, in the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest and developers to build a range of variant-specific ancestral inference pipelines on top. kalis exploits both multi-core parallelism and modern CPU vector instruction sets to enable scaling to hundreds of thousands of genomes. Conclusions: The resulting distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large scale genomic datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. IxPopDyMod: an R package to write, run, and analyze tick population and infection dynamics models.

Author: Stokowski, Myles and Allen, David
Subjects: POPULATION dynamics, IXODES scapularis, TICKS, BORRELIA burgdorferi, TICK-borne diseases, TICK infestations
Abstract: Given the increasing prevalence of tick-borne diseases, such as Lyme disease, modeling the population and infection dynamics of tick vectors is an important public health tool. These models have applications for testing the effects of control methods or climate change on tick populations. There is an established history of tick population models, but code for them is rarely shared, especially not in a convenient format for others to modify and use. We present an R package, called IxPopDyMod, intended to function as a flexible and consistent framework for reproducible Ixodidae (hard-bodied ticks) population dynamics models. Here we focus on two key parts of the package: a function to create valid model configurations and a function to run a configured model and return the daily population over time. We provide three examples in appendices: one reproducing an existing Ixodes scapularis population model, one providing a novel Dermacentor albipictus model, and one showing Borrelia burgdorferi infection in ticks. Together these examples show the flexibility of the package to model scenarios of interest to tick researchers. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. Neutrality in plant–herbivore interactions.

Author: Pan, Vincent S. and Wetzel, William C.
Abstract: Understanding the distribution of herbivore damage among leaves and individual plants is a central goal of plant–herbivore biology. Commonly observed unequal patterns of herbivore damage have conventionally been attributed to the heterogeneity in plant quality or herbivore behaviour or distribution. Meanwhile, the potential role of stochastic processes in structuring plant–herbivore interactions has been overlooked. Here, we show that based on simple first principle expectations from metabolic theory, random sampling of different sizes of herbivores from a regional pool is sufficient to explain patterns of variation in herbivore damage. This is despite making the neutral assumption that herbivory is caused by randomly feeding herbivores on identical and passive plants. We then compared its predictions against 765 datasets of herbivory on 496 species across 116° of latitude from the Herbivory Variability Network. Using only one free parameter, the estimated attack rate, our neutral model approximates the observed frequency distribution of herbivore damage among plants and especially among leaves very well. Our results suggest that neutral stochastic processes play a large and underappreciated role in natural variation in herbivory and may explain the low predictability of herbivory patterns. We argue that such prominence warrants its consideration as a powerful force in plant–herbivore interactions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. MCPtaggR: R package for accurate genotype calling in reduced representation sequencing data by eliminating error-prone markers based on genome comparison.

Author: Furuta, Tomoyuki and Yamamoto, Toshio
Abstract: Reduced representation sequencing (RRS) offers cost-effective, high-throughput genotyping platforms such as genotyping-by-sequencing (GBS). RRS reads are typically mapped onto a reference genome. However, mapping reads harbouring mismatches against the reference can potentially result in mismapping and biased mapping, leading to the detection of error-prone markers that provide incorrect genotype information. We established a genotype-calling pipeline named mappable collinear polymorphic tag genotyping (MCPtagg) to achieve accurate genotyping by eliminating error-prone markers. MCPtagg was designed for the RRS-based genotyping of a population derived from a biparental cross. The MCPtagg pipeline filters out error-prone markers prior to genotype calling based on marker collinearity information obtained by comparing the genome sequences of the parents of a population to be genotyped. A performance evaluation on real GBS data from a rice F2 population confirmed its effectiveness. Furthermore, our performance test using a genome assembly that was obtained by genome sequence polishing on an available genome assembly suggests that our pipeline performs well with converted genomes, rather than necessitating de novo assembly. This demonstrates its flexibility and scalability. The R package, MCPtaggR, was developed to provide functions for the pipeline and is available at https://github.com/tomoyukif/MCPtaggR. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
