Descriptor: "R" / Database: Complementary Index - Searchworks@Jio Institute Digital Library Search Results

1. SimpleMetaPipeline: Breaking the bioinformatics bottleneck in metabarcoding.

Author: Williams, Jake, Pettorelli, Nathalie, Dowell, Rosalie, Macdonald, Kenneth, Meyer, Christopher, Steyaert, Margaux, Tweedt, Sarah, and Ransome, Emma
Abstract: The democratisation of next‐generation sequencing has vastly increased the availability of sequencing data from metabarcoding. However, to effectively prepare these metabarcoding data for subsequent analysis, researchers must consistently apply several different bioinformatic tools, including those which denoise reads, cluster sequences and assign taxonomic identities. This often creates a bioinformatics bottleneck in workflows for non‐specialists due to obstacles around: (a) integrating different tools, (b) the inability to easily modify and rerun bioinformatic pipelines involving non‐scripted ('point‐and‐click') elements and (c) the multiple outputs that may be required of a single dataset (e.g. amplicon sequence variants [ASVs] and operational taxonomic units [OTUs]), which often results in users running pipelines multiple times. Here, we introduce SimpleMetaPipeline, an open‐source bioinformatics pipeline implemented in R, which addresses these obstacles. SimpleMetaPipeline integrates the most robust and commonly used existing bioinformatic tools in a single reproducible pipeline, with a streamlined choice of parameters, to generate a sequence data table containing alternative clustering and assignment options. SimpleMetaPipeline accepts demultiplexed paired‐end and single reads from multiple sequencing runs. We describe the pipeline and demonstrate how alternative annotations enable the easy implementation of multi‐algorithm agreement tests to strengthen inferences. SimpleMetaPipeline represents a valuable addition to the existing library of pipelines, providing easy and reproducible bioinformatics, including a range of commonly desired clustering and assignment options, such as OTUs and ASVs. [ABSTRACT FROM AUTHOR]
Published: 2024
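
The multi-algorithm agreement test mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not SimpleMetaPipeline's own code (the pipeline is written in R); the function name, the method labels `method_a` and `method_b`, and the example table are all hypothetical. The idea shown is only this: keep a taxonomic assignment when every listed method returns the same non-missing taxon.

```python
def agreement_filter(assignments, methods):
    """Return {sequence_id: taxon} for sequences on which all methods agree.

    assignments: {sequence_id: {method_name: taxon or None}} (hypothetical layout)
    methods: names of the methods whose calls must match
    """
    agreed = {}
    for seq_id, calls in assignments.items():
        # Collect the distinct calls made by the requested methods.
        taxa = {calls.get(method) for method in methods}
        # Keep the sequence only if there is exactly one call and it is not missing.
        if len(taxa) == 1 and None not in taxa:
            agreed[seq_id] = taxa.pop()
    return agreed


# Hypothetical example table: two assignment methods, three sequences.
example = {
    "ASV_1": {"method_a": "Gadus morhua", "method_b": "Gadus morhua"},
    "ASV_2": {"method_a": "Gadus morhua", "method_b": "Gadus ogac"},
    "ASV_3": {"method_a": None, "method_b": None},
}
consensus = agreement_filter(example, ["method_a", "method_b"])
```

Here only `ASV_1` survives: `ASV_2` is dropped because the two methods disagree, and `ASV_3` because neither method made a call.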

2. repDilPCR: a tool for automated analysis of qPCR assays by the dilution-replicate method.

Author: Yosifov, Deyan Yordanov, Reichenzeller, Michaela, Stilgenbauer, Stephan, and Mertens, Daniel
Subjects: POLYMERASE chain reaction, STATISTICAL software, SOURCE code, TELEOLOGY, TWO-way analysis of variance, INTERNET servers
Abstract: Background: The dilution-replicate experimental design for qPCR assays is especially efficient. It is based on multiple linear regression of multiple 3-point standard curves that are derived from the experimental samples themselves and thus obviates the need for a separate standard curve produced by serial dilution of a standard. The method minimizes the total number of reactions and guarantees that Cq values are within the linear dynamic range of the dilution-replicate standard curves. However, the lack of specialized software has so far precluded the widespread use of the dilution-replicate approach. Results: Here we present repDilPCR, the first tool that utilizes the dilution-replicate method and extends it by adding the possibility to use multiple reference genes. repDilPCR offers extensive statistical and graphical functions that can also be used with preprocessed data (relative expression values) obtained by usual assay designs and evaluation methods. repDilPCR has been designed with the philosophy to automate and speed up data analysis (typically less than a minute from Cq values to publication-ready plots), and features automatic selection and performance of appropriate statistical tests, at least in the case of one-factor experimental designs. Nevertheless, the program also allows users to export intermediate data and perform more sophisticated analyses with external statistical software, e.g. if two-way ANOVA is necessary. Conclusions: repDilPCR is a user-friendly tool that can contribute to more efficient planning of qPCR experiments and their robust analysis. A public web server is freely accessible at https://repdilpcr.eu without registration. The program can also be used as an R script or as a locally installed Shiny app, which can be downloaded from https://github.com/deyanyosifov/repDilPCR where also the source code is available. [ABSTRACT FROM AUTHOR]
Published: 2024
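
The regression idea behind the dilution-replicate design can be sketched in a few lines of Python. This is not repDilPCR's code (the tool is an R script and Shiny app); it is a minimal illustration that assumes every sample's 3-point curve shares one slope while each sample has its own intercept. The function name, the record layout, and the example Cq values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_shared_slope(records):
    """Least-squares fit of Cq = intercept[sample] + slope * log10(relative concentration).

    records: iterable of (sample_id, log10_relative_concentration, cq).
    Returns (slope, {sample_id: intercept}).
    """
    by_sample = defaultdict(list)
    for sample_id, x, y in records:
        by_sample[sample_id].append((x, y))

    sxy = sxx = 0.0
    centers = {}
    for sample_id, points in by_sample.items():
        x_bar = mean(x for x, _ in points)
        y_bar = mean(y for _, y in points)
        centers[sample_id] = (x_bar, y_bar)
        # Centering within each sample removes the sample-specific intercept.
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in points)
        sxx += sum((x - x_bar) ** 2 for x, _ in points)

    slope = sxy / sxx
    intercepts = {s: y_bar - slope * x_bar for s, (x_bar, y_bar) in centers.items()}
    return slope, intercepts


# Hypothetical data: two samples, each measured undiluted, 1:10 and 1:100.
records = [
    ("A", 0, 20.00), ("A", -1, 23.32), ("A", -2, 26.64),
    ("B", 0, 23.32), ("B", -1, 26.64), ("B", -2, 29.96),
]
slope, intercepts = fit_shared_slope(records)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency implied by the slope
log10_ratio_b_vs_a = (intercepts["B"] - intercepts["A"]) / slope
```

With these made-up values the fitted slope is about -3.32 (an efficiency close to 1, i.e. near-doubling per cycle), and sample B comes out roughly tenfold less abundant than sample A.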

3. Trends and Future Directions in Analysing Attractiveness of Geoparks Using an Automated Merging Method of Multiple Databases—R-Based Bibliometric Analysis.

Author: Nyulas, Judith, Dezsi, Ștefan, Niță, Adrian, Toma, Raluca-Andreea, and Lazăr, Ana-Maria
Subjects: MUD volcanoes, CONCEPT mapping, GEOPARKS, CREATIVE ability in science, DATABASES, CITATION indexes, BIBLIOMETRICS
Abstract: Since their creation, geoparks have been among the fastest growing natural environments. Their attractiveness is one of the most important factors for the success of this natural destination. Despite their importance, a bibliometric analysis on geopark attractiveness is missing from the studied databases. The aim of this paper is to synthesise a heterogeneous body of knowledge of geoparks in terms of attractiveness, highlighting the evolution and breadth of the research field. To achieve this, the following objectives were set: (a) to adopt a method based on functions provided by the bibliometrix package to automatically combine databases, namely WoS, Scopus, PubMed and Dimensions, detailing the method used and (b) to analyse the bibliometric indicators in order to identify the trends in the literature and the possible directions for future research. The applied methodology was based on bibliometric analysis using R for non-coders. From the 707 documents retrieved, the validation process resulted in 349 eligible documents published between 2002 and 2024, on which the analysis was carried out. The current study elaborated a method and examined the key information on the topic trends, which were given by production performance, productivity trends, spatial analysis and abstract approach analysis. Additionally, strategic mapping of the conceptual context was performed. Thus, the result provides a description of the automatic method with practical applications. As discerned from the three-dimensional analysis (spatial, temporal and size), the emerging research directions within scientific creativity encompassed (1) forms of tourism practiced in geoparks, especially focused on ecotourism and volcanic tourism; (2) geomorphological features, mineral springs and mud volcanoes; (3) aesthetic aspects, scenic sites and mining heritage; and (4) methodology, data analysis and modelling methods across different regions and countries. [ABSTRACT FROM AUTHOR]
Published: 2024

4. Identifying disputants' attitudinal variations in family mediations: A data mining approach.

Author: Xu, Qingxin
Subjects: FAMILY mediation, DATA mining, EVALUATION, DATA analysis, CRITICAL discourse analysis
Abstract: This article combines linguistic analysis and data mining methods to explore variations in speakers' evaluative meaning-making in conflict talks. It focuses on conflict style construction through evaluative language, specifically how disputants advance attitudes. The corpus consists of 230 minutes of family mediation talks involving 12 divorcing spouses. The research draws from the Appraisal framework to analyse evaluative meaning-making at a discourse semantics level, capturing both explicit and implicit attitudes, as well as the scaling and dialogic framing of attitudes. Data exploration uses clustering algorithms via RStudio to identify variations in disputants' discursive behaviour. The findings uncover three conflict styles based on disputants' preference for attitude advancement formulations, with varying degrees of assertiveness and forcefulness. This study's contributions include a holistic treatment of evaluative meaning-making, the marriage of digital tools to nuanced linguistic annotation, and a novel interpretation for conflict style. The findings offer fresh insights into disputants' discursive self-presentation in confrontational exchanges. [ABSTRACT FROM AUTHOR]
Published: 2024

5. Software application profile: tpc and micd—R packages for causal discovery with incomplete cohort data.

Author: Andrews, Ryan M, Bang, Christine W, Didelez, Vanessa, Witte, Janine, and Foraita, Ronja
Subjects: MISSING data (Statistics), STATISTICAL errors, APPLICATION software, SOURCE code, SCALING (Social sciences)
Abstract: Motivation: The Peter Clark (PC) algorithm is a popular causal discovery method to learn causal graphs in a data-driven way. Until recently, existing PC algorithm implementations in R had important limitations regarding missing values, temporal structure or mixed measurement scales (categorical/continuous), which are all common features of cohort data. The new R packages presented here, micd and tpc, fill these gaps. Implementation: micd and tpc packages are R packages. General features: The micd package provides add-on functionality for dealing with missing values to the existing pcalg R package, including methods for multiple imputations relying on the Missing At Random assumption. Also, micd allows for mixed measurement scales assuming conditional Gaussianity. The tpc package efficiently exploits temporal information in a way that results in a more informative output that is less prone to statistical errors. Availability: The tpc and micd packages are freely available on the Comprehensive R Archive Network (CRAN). Their source code is also available on GitHub (https://github.com/bips-hb/micd; https://github.com/bips-hb/tpc). [ABSTRACT FROM AUTHOR]
Published: 2024

6. Places as brands: charting the value of place-based intangibles.

Author: Castaldi, Carolina and Mendonça, Sandro
Subjects: PLACE marketing, INTANGIBLE property, REGIONAL development
Abstract: What happens when actors and sectors incorporate references to specific territories in their development efforts? Throughout history, producers have capitalised individually and collectively on place associations through mechanisms such as product marketing and cultural framing in order to protect investments and promote internationalisation efforts. This contribution explores the agenda-setting power of the concept of 'place-based intangibles': assets with qualities of intangibles but intricately tied to specific geographical commons. After defining the concept, we overview the key substantive topics, innovative approaches and policy puzzles in three thematic areas of research: (1) regional strategies to build place-based intangibles; (2) place-based intangibles and regional development; and (3) place-based intangibles and corporate strategies. We conclude by discussing the prospects set forth by the assemblage of fresh papers included in this special issue, which this paper also introduces, and put forward the implications of place-based intangibles for policymakers, businesses and scholars. [ABSTRACT FROM AUTHOR]
Published: 2024

7. Personality and place as resources for regional development: Alfred Nobel's Karlskoga.

Author: Pugh, Rhiannon and Andersson, Ida
Subjects: REGIONAL development, PLACE marketing, SOCIAL structure
Abstract: In 'Alfred Nobel's Karlskoga', Sweden, the municipality has placed its most famous former resident at the heart of its economic development strategy. Through an in-depth qualitative case study, we examine the tensions and complexities surrounding this process and fill an existing research gap around personality-based place branding for regional development purposes. The findings suggest that even with a world-famous figure as talisman, personality-based place branding is a complex endeavour where old rivalries, tightknit social structures and economic dependencies make us question: is it even possible to build a brand that is both inclusive and truly representational of a place? [ABSTRACT FROM AUTHOR]
Published: 2024

8. OrgaMapper: a robust and easy-to-use workflow for analyzing organelle positioning.

Author: Schmied, Christopher, Ebner, Michael, Samsó, Paula, Van Der Veen, Rozemarijn, Haucke, Volker, and Lehmann, Martin
Subjects: CELL morphology, CELL size, CELL nuclei, INTRACELLULAR space, ORGANELLES
Abstract: Background: Eukaryotic cells are highly compartmentalized by a variety of organelles that carry out specific cellular processes. The position of these organelles within the cell is elaborately regulated and vital for their function. For instance, the position of lysosomes relative to the nucleus controls their degradative capacity and is altered in pathophysiological conditions. The molecular components orchestrating the precise localization of organelles remain incompletely understood. A confounding factor in these studies is the fact that organelle positioning is surprisingly non-trivial to address; e.g., perturbations that affect the localization of organelles often lead to secondary phenotypes such as changes in cell or organelle size. These phenotypes could potentially mask effects or lead to the identification of false positive hits. To uncover and test potential molecular components at scale, accurate and easy-to-use analysis tools are required that allow robust measurements of organelle positioning. Results: Here, we present an analysis workflow for the faithful, robust, and quantitative analysis of organelle positioning phenotypes. Our workflow consists of an easy-to-use Fiji plugin and an R Shiny App. These tools enable users without background in image or data analysis to (1) segment single cells and nuclei and to detect organelles, (2) to measure cell size and the distance between detected organelles and the nucleus, (3) to measure intensities in the organelle channel plus one additional channel, (4) to measure radial intensity profiles of organellar markers, and (5) to plot the results in informative graphs. Using simulated data and immunofluorescent images of cells in which the function of known factors for lysosome positioning has been perturbed, we show that the workflow is robust against common problems for the accurate assessment of organelle positioning such as changes of cell shape and size, organelle size and background. Conclusions: OrgaMapper is a versatile, robust, and easy-to-use automated image analysis workflow that can be utilized in microscopy-based hypothesis testing and screens. It effectively allows for the mapping of the intracellular space and enables the discovery of novel regulators of organelle positioning. [ABSTRACT FROM AUTHOR]
Published: 2024

9. Employee Ratings and Reviews Data from Glassdoor.

Author: Zhou, Mi, Li, Yaxuan, Qiao, Zhilei, and Shi, Bowen
Subjects: EMPLOYEE reviews, RESEARCH personnel, COVID-19, ANGLES
Abstract: This paper presents the employee ratings and reviews data from Glassdoor and the R codes used to collect, clean, and organize the data. We collect three types of information for each Glassdoor review: review metrics, content, and reviewer information. We also calculate some commonly used textual metrics, such as sentiment, readability, the number of uncertainty words, etc. The datasets include necessary identifiers that can connect to other financial data sources. All the variables and metrics are provided at the review level, which enables researchers to aggregate the data from different levels and angles. As a demonstrative example, we use a simple word list to measure how often employees mention COVID-19 in review comments. The R codes provided as an RStudio project are self-contained and can also be modified and applied to other data sources of interest. [ABSTRACT FROM AUTHOR]
Published: 2024

10. ARDL: An R Package for ARDL Models and Cointegration.

Author: Natsiopoulos, Kleanthis and Tzeremes, Nickolaos G.
Subjects: COINTEGRATION, LANGUAGE & languages
Abstract: This paper presents the ARDL package for the statistical language R, demonstrating its main functionalities in a step-by-step guide. Some of its main advantages over other related R packages are the intuitive API and the fact that it includes many important features missing from other packages that are essential for an in-depth analysis. Additionally, it is designed in such a way that it can be combined with other packages for post-regression diagnostics and tests. These characteristics are shown through an example, where we showcase part of the application demonstrated in the seminal work of Pesaran et al. (J Appl Econom 16:289–326, 2001). [ABSTRACT FROM AUTHOR]
Published: 2024

11. Spatial macroeconomics.

Author: Bond-Smith, Steven, Corrado, Luisa, Felsenstein, Daniel, and Elhorst, Paul
Abstract: This special issue on spatial macroeconomics aims to bridge the divide between spatial and macroeconomics. Defined in the introduction, spatial macroeconomics explores the interactions between economic activity and geographical space. The issue comprises eleven papers authored by a total of 32 researchers. These papers were selected through a combination of solicited submissions and an open call for contributions. Four papers within this special issue delve into spatial macroeconomic theory. They cover topics such as agglomeration economies for innovation, a neoclassical spatial general equilibrium growth model, the spatial sorting of heterogeneous workers and the impact of national industrial policies in strategic industries on trade. Additionally, seven papers offer empirical studies that encompass a wide range of methodologies. These include general equilibrium models, input-output-based analyses and econometric models. The empirical research addresses various topics, such as the impact of trade on productivity, the trade-off between efficiency and equity, fiscal assistance, local and nationwide fiscal multipliers, forced human displacement during wars and the spatial diffusion effects of renewable energy resource deployment. [ABSTRACT FROM AUTHOR]
Published: 2024

12. Multilevel Semiparametric Latent Variable Modeling in R with "galamm".

Author: Sørensen, Øystein
Subjects: STRUCTURAL equation modeling, ITEM response theory, AUTOMATIC differentiation, LATENT variables, SPARSE matrices
Abstract: We present the R package galamm, whose goal is to provide common ground between structural equation modeling and mixed effect models. It supports estimation of models with an arbitrary number of crossed or nested random effects, smoothing splines, mixed response types, factor structures, heteroscedastic residuals, and data missing at random. Implementation using sparse matrix methods and automatic differentiation ensures computational efficiency. We here briefly present the implemented methodology, give an overview of the package and an example demonstrating its use. [ABSTRACT FROM AUTHOR]
Published: 2024

13. Climate change impacts assessment on precipitation within and around an urbanizing city under shared socioeconomic pathways.

Author: Rentachintala, Lakshmi Raghu Nagendra Prasad, Reddy, M G Muni, and Mohapatra, Pranab Kumar
Subjects: PRECIPITATION variability, CLIMATE change, TREND analysis, TIME series analysis
Abstract: In the current study, the impacts of climate change on precipitation in Amaravati city, Andhra Pradesh, India, are assessed. Trends and variability of precipitation changes for the historical period 1951–2014 and for various Shared Socioeconomic Pathway (SSP) scenarios of the 2015–2100 period are determined from daily precipitation time series of bias-corrected climate projections from the CMIP6 GCM ACCESS-CM2. The Mann–Kendall (M–K) test and Sen's slope estimator are used to perform trend and variability analysis of precipitation attributed to climate change. No magnitude of trend is found for precipitation, either for observed data or under SSP scenarios. However, increasing and decreasing trends are observed in season-wise and annual precipitation datasets, both for observed data and under SSP scenarios, within and around the study area. Also, the urban land proportion projections indicate that the total area becomes urbanized even at the end of the year 2060. The findings of this study may assist in predicting the impacts of climate change on precipitation in a city. [ABSTRACT FROM AUTHOR]
Published: 2024
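
The two trend statistics named in the abstract above are standard and can be sketched briefly in Python. This is not the study's own code and uses no data from it. It assumes equally spaced observations, and the Mann–Kendall variance omits the correction for tied values, so it is suited only to series with few or no ties. The function names and the example series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(values):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test (no tie correction)."""
    n = len(values)
    # S sums the signs of all later-minus-earlier differences.
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(values, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def sens_slope(values):
    """Median of all pairwise slopes, in units of value per time step."""
    slopes = [(values[j] - values[i]) / (j - i)
              for i, j in combinations(range(len(values)), 2)]
    return median(slopes)


# Hypothetical series with a steady upward trend of one unit per step.
series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
s_stat, z_stat, p_value = mann_kendall(series)
trend_per_step = sens_slope(series)
```

The Mann–Kendall test indicates whether a monotonic trend is present, while Sen's slope estimates its magnitude. For the strictly increasing series here, S takes its maximum value of 45 and the estimated slope is exactly 1 per step.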

14. Transforming environmental health datasets from the comparative toxicogenomics database into chord diagrams to visualize molecular mechanisms.

Author: Wyatt, Brent, Davis, Allan Peter, Wiegers, Thomas C., Wiegers, Jolene, Abrar, Sakib, Sciaky, Daniela, Barkalow, Fern, Strong, Melissa, and Mattingly, Carolyn J.
Subjects: ENVIRONMENTAL databases, ENVIRONMENTAL health, DATABASES, TOXICOGENOMICS, DATA visualization
Abstract: In environmental health, the specific molecular mechanisms connecting a chemical exposure to an adverse endpoint are often unknown, reflecting knowledge gaps. At the public Comparative Toxicogenomics Database (CTD; https://ctdbase.org/), we integrate manually curated, literature-based interactions from CTD to compute four-unit blocks of information organized as a potential step-wise molecular mechanism, known as "CGPD-tetramers," wherein a chemical interacts with a gene product to trigger a phenotype which can be linked to a disease. These computationally derived datasets can be used to fill the gaps and offer testable mechanistic information. Users can generate CGPD-tetramers for any combination of chemical, gene, phenotype, and/or disease of interest at CTD; however, such queries typically result in the generation of thousands of CGPD-tetramers. Here, we describe a novel approach to transform these large datasets into user-friendly chord diagrams using R. This visualization process is straightforward, simple to implement, and accessible to inexperienced users that have never used R before. Combining CGPD-tetramers into a single chord diagram helps identify potential key chemicals, genes, phenotypes, and diseases. This visualization allows users to more readily analyze computational datasets that can fill the exposure knowledge gaps in the environmental health continuum. [ABSTRACT FROM AUTHOR]
Published: 2024

15. Can intellectual property protection promote domestic value added in exports? Evidence from mainland China.

Author: Huang, Junpei, Lin, Shanlang, Zhang, Yiqiao, Wang, Ning, and Wang, Zhenyu
Subjects: INTELLECTUAL property, MARKET power, PRICES, EXPORTS, DATABASES
Abstract: Based on a unified framework, this study explores the influence of regional intellectual property protection (IPP) on domestic value added in exports. A theoretical model is constructed to illustrate that IPP affects the types and prices of intermediate goods through effects of market power, product category, and crowding-out, influences the choice of domestic and imported raw materials, and finally affects the domestic added value in exports of enterprises. Empirical evidence on the influence of IPP on the domestic added value in exports is provided utilizing China Customs Statistics, the Chinese Industrial Enterprise Database, and Chinese prefectural-level data from 2000 to 2007. The results show that IPP significantly promotes the domestic value added of exports in China, but the effect is heterogeneous in terms of region and mode of enterprise trade; in the eastern region and processing trade enterprises the IPP effect is more pronounced. [ABSTRACT FROM AUTHOR]
Published: 2024

16. Computational social science in regional analysis and the European real estate market.

Author: Gabrielli, Lorenzo, Sulis, Patrizia, Fontana, Matteo, Signorelli, Serena, Vespe, Michele, and Lavalle, Carlo
Subjects: SOCIAL sciences, REAL property, MONETARY unions, EMPIRICAL research
Abstract: The recent so-called 'data revolution' offers unprecedented opportunities to innovate regional policies. New data sources are being widely used by the scientific community; however, their uptake is far from being systematic in the policy cycle, where data innovation can improve territorial impact assessment. This paper presents a survey on the use of non-traditional data in the context of regional policy, together with a case study on real estate markets of three European countries, highlighting the perspectives and limitations of computational social science in regional analysis in terms of data quality and availability. [ABSTRACT FROM AUTHOR]
Published: 2024

17. Modeling item-level heterogeneous treatment effects: A tutorial with the glmer function from the lme4 package in R.

Author: Gilbert, Joshua B.
Abstract: Recent advancements in education scholarship have introduced Item Response Theory (IRT) models to address treatment heterogeneity at the assessment item level. These models for item-level heterogeneous treatment effects (IL-HTE) enable detailed analyses of treatments that may have varying impacts on individual items within an assessment. This article offers a comprehensive tutorial for applied researchers interested in implementing IL-HTE analysis in R, utilizing the lme4 package. Using empirical data from a second-grade reading comprehension assessment as a running example, this tutorial emphasizes model-building strategies, interpretation techniques, visualization methods, and extensions. By following this tutorial, researchers will gain practical insights into utilizing IL-HTE analysis for enhanced understanding and interpretation of treatment effects at the item level. [ABSTRACT FROM AUTHOR]
Published: 2024

18. Trends and Directions in Oats Research under Drought and Salt Stresses: A Bibliometric Analysis (1993–2023).

Author: Huang, Haiyan, Wang, Xiangtao, Li, Junqin, Gao, Yang, Yang, Yuting, Wang, Rui, Zhou, Zijun, Wang, Puchang, and Zhang, Yujun
Subjects: SCIENTIFIC literature, AGRICULTURAL colleges, FIELD crops, BIBLIOMETRICS, CLIMATE change, OATS
Abstract: With global climate change leading to increasing intensity and frequency of droughts, as well as the growing problem of soil salinization, these factors significantly affect crop growth, yield, and resilience to adversity. Oats are a cereal widely grown in temperate regions and are rich in nutritive value; however, the scientific literature on the response of oat to drought and salt stress has not yet been analyzed in detail. This study comprehensively analyzed the response of oat to drought stress and salt stress using data from the Web of Science core database and bibliometric methods with R (version 4.3.1), VOSviewer (version 1.6.19), and CiteSpace (version 6.3.1.0) software. The number of publications shows an increasing trend in drought stress and salt stress in oat over the past 30 years. In the field of drought-stress research, China, the United States, and Canada lead in terms of literature publication, with the most academic achievements being from China Agricultural University and Canadian Agricultural Food University. The journal with the highest number of published papers is Field Crops Research. Oat research primarily focuses on growth, yield, physiological and biochemical responses, and strategies for improving drought resistance. Screening of drought-tolerant genotypes and transformation of drought-tolerant genes may be key directions for future oat drought research. In the field of salt-stress research, contributions from China, the United States, and India stand out, with the Chinese Academy of Agricultural Sciences and Inner Mongolia Agricultural University producing the most significant research results. The largest number of published articles has been found in the Physiologia Plantarum journal. Current oat salt-stress research primarily covers growth, physiological and biochemical responses, and salt-tolerance mechanisms. It is expected that future oat salt research will focus more on physiological and biochemical responses, as well as gene-editing techniques. Despite achievements under single-stress conditions, combined drought and salt-stress effects on oat remain understudied, necessitating future research on their interaction at various biological levels. The purpose of this study is to provide potential theoretical directions for oat research on drought and salt stress. [ABSTRACT FROM AUTHOR]
Published: 2024

19. occTest: An integrated approach for quality control of species occurrence data.

Author: Serra‐Diaz, Josep M., Borderieux, Jeremy, Maitner, Brian, Boonman, Coline C. F., Park, Daniel, Guo, Wen‐Yong, Callebaut, Arnaud, Enquist, Brian J., Svenning, Jens‐C., and Merow, Cory
Subjects: DIGITIZATION, QUALITY control, DATA scrubbing, OUTLIER detection, SPECIES, TEST interpretation
Abstract: Aim: Species occurrence data are valuable information that enables one to estimate geographical distributions, characterize niches and their evolution, and guide spatial conservation planning. Rapid increases in species occurrence data stem from increasing digitization and aggregation efforts, and citizen science initiatives. However, persistent quality issues in occurrence data can impact the accuracy of scientific findings, underscoring the importance of filtering erroneous occurrence records in biodiversity analyses. Innovation: We introduce an R package, occTest, that synthesizes a growing open‐source ecosystem of biodiversity cleaning workflows to prepare occurrence data for different modelling applications. It offers a structured set of algorithms to identify potential problems with species occurrence records by employing a hierarchical organization of multiple tests. The workflow has a hierarchical structure organized in testPhases (i.e. cleaning vs. testing) that encompass different testBlocks grouping different testTypes (e.g. environmental outlier detection), which may use different testMethods (e.g. Rosner test, jackknife, etc.). Four different testBlocks characterize potential problems in geographic, environmental, human influence and temporal dimensions. Filtering and plotting functions are incorporated to facilitate the interpretation of tests. We provide examples with different data sources, with default and user‐defined parameters. Compared to other available tools and workflows, occTest offers a comprehensive suite of integrated tests, and allows multiple methods associated with each test to explore consensus among data cleaning methods. It uniquely incorporates both coordinate accuracy analysis and environmental analysis of occurrence records. Furthermore, it provides a hierarchical structure to incorporate future tests yet to be developed. Main conclusions: occTest will help users understand the quality and quantity of data available before the start of data analysis, while also enabling users to filter data using either predefined rules or custom‐built rules. As a result, occTest can better assess each record's appropriateness for its intended application. [ABSTRACT FROM AUTHOR]
Published: 2024

20. Selective editing for asymmetry analysis in intra-EU trade Micro-Data Exchange (MDE).

Author: Bruno, Mauro, Causo, Maria Serena, Massacci, Giulio, Ortame, Francesco, Ruocco, Giuseppina, and Toti, Simona
Subjects: STATISTICS, STANDARDIZATION, SUSPICION, IMPORTS, EDITING
Abstract: Since January 2022, the Regulation on European Business Statistics (EU 2019/2152) requires EU Member States to compulsorily share microdata on intra-EU exports. Establishing intra-EU export Micro-Data Exchange (MDE) provides National Statistical Institutes with a new data source to compile intra-EU import statistics. The availability of MDE tackles two key challenges: diminishing the overall response burden on data providers and meeting user expectations regarding the quality of the produced statistics. However, transitioning to a data production system based on MDE data requires the assessment of the coherence and comparability between MDE and National import data. To identify asymmetries between the two data sources, Istat developed an innovative application designed to foster cooperation among Member States. The tool was developed using the Shiny package in R. The implemented solution allows users to perform exploratory analysis, systematic error detection, and selective editing. The most relevant asymmetries are identified through relative contribution and the asymmetry suspicion indices assessed by user-defined thresholds. Sharing the open tool within the European Statistical System enhances interoperability, promotes method harmonization, and encourages the adoption of official statistical standards. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Twinning of the Egongyan Bridge.

Author: Chen, Xiaohu, Qi, Yong, and Tang, Man-Chung
Subjects: CITY traffic, SUSPENSION bridges, LONG-span bridges, BRIDGES, PEDESTRIANS
Abstract: The First Egongyan Bridge in Chongqing, China was opened to traffic on December 27, 2000. It carries six lanes of city traffic and two pedestrian paths, one on each side. It is a 600 m span suspension bridge. It was planned that the two pedestrian paths might eventually be converted to transit tracks. However, with the rapid increase in urban traffic, the deck was transformed into an 8-lane bridge instead. Therefore, a new bridge parallel and adjacent to the old bridge was built for the rail transit. For aesthetic reasons, the city decided that the new bridge should also be a suspension bridge, with the same span length and tower height as the first bridge. The First Egongyan Bridge was a true suspension bridge with main cables anchored to ground anchors. However, the new bridge's proximity to the first bridge necessitated the construction of a self-anchored suspension bridge that did away with a second set of ground anchors. This 600 m span happened to be the world's longest span for a self-anchored suspension bridge. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. Using R for Multivariate Meta-analysis on Educational Psychology Data: A Method Study.

Author: Savatsomboon, Gamon, Ruannakarn, Prasert, Yurayat, Phamornpun, Chanprasitchai, Ong-art, and Leihaothabam, Jibon Kumar Sharma
Subjects: EDUCATIONAL psychology, PUBLICATIONS, DATA analysis, META-analysis, MULTIVARIATE analysis
Abstract: Using R to conduct univariate meta-analyses is becoming common for publication. R can also conduct multivariate meta-analysis (MMA), but newcomers to both R and MMA may find using R to conduct MMA daunting: R may not be easy for those unfamiliar with coding, and MMA is a topic of advanced statistics. Thus, it may be very challenging for most newcomers to conduct MMA using R. If this holds, it can be viewed as a practice gap; that is, researchers are not capable of using R to conduct MMA in practice. This is problematic. This paper alleviates this practice gap by illustrating how to use R (the metaSEM package) to conduct MMA on educational psychology data. Here, the metaSEM package is used to obtain the required MMA text outputs. However, the metaSEM package is not capable of producing the other required graphical outputs. As a result, the metafor package is also used as a complement to generate the required graphical outputs. Ultimately, we hope that our audience will be able to apply what they learn from this method paper to conduct MMA using R in their teaching, research, and publication. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. Bibliometric Analysis of Academic Studies on Particleboard.

Author: BERAM, Abdullah
Subjects: BIBLIOMETRICS, PARTICLE board, FOREST products, SCIENCE databases, WEB databases
Abstract: Copyright of Düzce University Journal of Forestry / Düzce Üniversitesi Orman Fakültesi Ormancılık Dergisi is the property of Duzce University and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2024
Full Text: View/download PDF

24. Raising the bar in spatial economic analysis: two laws of spatial economic modelling.

Author: Elhorst, J. Paul
Subjects: ECONOMIC models, REAL estate economics, RETROSPECTIVE studies
Abstract: Based on a prospective and retrospective analysis of the 'Raising the bar' editorials preceding regular issues of this journal from 2016–2023, this paper identifies two laws of spatial economic modelling: (i) Units of observation cannot be treated as independent entities because they interact and (ii) interaction causes spillovers from one unit to another. Common themes and trends uncovered in these editorials, the progress that has been made and the direction in which the research fields can continue to develop are discussed, using both laws as references. Currently, there is no single approach that adequately addresses both laws. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. 'RISDM': species distribution modelling from multiple data sources in R.

Author: Foster, Scott D., Peel, David, Hosack, Geoffrey R., Hoskins, Andrew, Mitchell, David J., Proft, Kirstin, Yang, Wen‐Hsi, Uribe‐Rivera, David E, and Froese, Jens G.
Subjects: SPECIES distribution, POISSON processes, POINT processes
Abstract: Species distribution models (SDMs) are usually based on a single data type, such as presence‐only (PO), presence‐absence (PA) or abundance (AA). Results from SDMs using single sources of data will suffer from inherent biases and limitations to that data type. For example, PO data contain sampling‐bias and PA/AA data are often less expansive and more sparse. Integrated SDMs (ISDMs) combine multiple data types and have recently emerged as a way to leverage strengths and minimise weaknesses of the different data types. They pose a common (distribution) model and separate observation models for each of the data types. The 'RISDM' package for the R environment (www.r‐project.org) provides access to this modelling framework using functions for preparation, fitting, interpreting and diagnosing models. The functionality of the package is demonstrated here using synthetic data sets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Stacked probability plots of the extended illness-death model using constant transition hazards – an easy to use shiny app.

Author: Grodd, Marlon, Weber, Susanne, and Wolkewitz, Martin
Subjects: MATHEMATICAL forms, NOSOCOMIAL infections, PROBABILITY theory, DATA structures, HAZARDS
Abstract: Background: Extended illness-death models (a specific class of multistate models) are a useful tool to analyse situations like hospital-acquired infections, ventilation-associated pneumonia, and transfers between hospitals. The main components of these models are hazard rates and transition probabilities. Calculation of different measures and their interpretation can be challenging due to their complexity. Methods: By assuming time-constant hazards, the complexity of these models becomes manageable and closed mathematical forms for transition probabilities can be derived. Using these forms, we created a tool in R to visualize transition probabilities via stacked probability plots. Results: In this article, we present this tool and give some insights into its theoretical background. Using published examples, we give guidelines on how this tool can be used. Our goal is to provide an instrument that helps obtain a deeper understanding of a complex multistate setting. Conclusion: While multistate models (in particular extended illness-death models) can be highly complex, this tool can be used in studies both to understand assumptions made during planning and as a first step in analysing complex data structures. An online version of this tool can be found at https://eidm.imbi.uni-freiburg.de/. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
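
Illustrative note on record 26: the closed forms mentioned in the abstract can be sketched for the simplest case. The block below is a minimal Python sketch, not the authors' R tool, and it assumes a plain three-state illness-death model without recovery (0 = initial, 1 = ill, 2 = absorbing) rather than the extended model of the paper. The function name and the hazard values are hypothetical.

```python
import math


def illness_death_probs(t, a01, a02, a12):
    """Occupation probabilities at time t for a subject starting in state 0.

    States: 0 = initial, 1 = ill, 2 = absorbing (e.g. death).
    a01, a02, a12 are time-constant hazards (hypothetical values below).
    """
    lam = a01 + a02  # total hazard of leaving state 0
    p00 = math.exp(-lam * t)
    if math.isclose(lam, a12):
        # Limiting form when the two exit rates coincide
        p01 = a01 * t * math.exp(-lam * t)
    else:
        p01 = a01 / (lam - a12) * (math.exp(-a12 * t) - math.exp(-lam * t))
    p02 = 1.0 - p00 - p01  # remaining probability mass is in the absorbing state
    return p00, p01, p02


# Stacking the three values at each time point gives one column
# of a stacked probability plot.
for day in (1, 5, 10, 20):
    p00, p01, p02 = illness_death_probs(day, a01=0.05, a02=0.02, a12=0.10)
    print(f"t={day:>2}: P00={p00:.3f}  P01={p01:.3f}  P02={p02:.3f}")
```

Because the hazards are constant, each probability is a simple combination of exponentials, which is what makes the model tractable enough to plot interactively.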

27. Unlocking the genetic tapestry of autoimmune diseases: Unveiling common genes across multiple conditions.

Author: Ghosh, Soujanya, Mohanty, Rupali, Santra, Arunava, Saha, Anisha, Agrawal, Anubha, Shrivastava, Sharmishtha, Roy, Chandrashish, Mazumder, Ishanee, Das, Debarup, and Mahmood, Syed Haaris
Subjects: AUTOIMMUNE diseases, TYPE 1 diabetes, INFLAMMATORY bowel diseases, SYSTEMIC lupus erythematosus, SJOGREN'S syndrome, GENE expression profiling
Abstract: Objectives: This study aimed to unravel the complexities of autoimmune diseases by conducting a comprehensive analysis of gene expression data across 10 conditions, including systemic lupus erythematosus (SLE), psoriasis, Sjögren's syndrome, sclerosis, immune‐associated diseases, osteoarthritis, cystic fibrosis, inflammatory bowel disease (IBD), type 1 diabetes, and Guillain–Barré syndrome. Methods: Gene expression profiles were rigorously examined to identify both upregulated and downregulated genes specific to each autoimmune disease. The study employed visual representation techniques such as heatmaps, volcano plots, and contour‐MA plots to provide an intuitive understanding of the complex gene expression patterns in these conditions. Results: Distinct gene expression profiles for each autoimmune condition were uncovered, with psoriasis and osteoarthritis standing out due to a multitude of both upregulated and downregulated genes, indicating intricate molecular interplays in these disorders. Notably, common upregulated and downregulated genes were identified across various autoimmune conditions, with genes like SELENBP1, MMP9, BNC1, and COL1A1 emerging as pivotal players. Conclusion: This research contributes valuable insights into the molecular signatures of autoimmune diseases, highlighting the unique gene expression patterns characterizing each condition. The identification of common genes shared among different autoimmune conditions, and their potential role in mitigating the risk of rare diseases in patients with more prevalent conditions, underscores the growing significance of genetics in healthcare and the promising future of personalized medicine. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Incorporating Data Analytics in Management Accounting: A Teaching Case on Cost Estimation.

Author: Hesford, James W., Pizzini, Mina, and Turner, Michael J.
Subjects: MANAGERIAL accounting, COST functions, VARIABLE costs, OVERHEAD costs, REGRESSION analysis
Abstract: Management accounting tools are based on the idea that total costs are composed of fixed and variable components. Textbooks usually teach five methods to identify fixed and variable costs, including two that are outdated (scatterplot and high-low). Account analysis and industrial engineering are sometimes useful, but regression is best for its objectivity. Despite its importance, few cases address cost estimation using regression. This case requires students to use regression analysis to estimate a cost function from 108 monthly observations of unit-level data from a hotel chain. Although the case is intended for use with R or Python, instructors can also use Excel. Students learn how to address important data and model specification issues, including outliers, autocorrelation, and inflation. Students are excited to acquire relevant tools and skills they can apply to future work projects. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
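
Illustrative note on record 28: the idea of estimating a cost function by regression can be sketched as an ordinary least-squares fit of total cost on activity volume, where the intercept estimates fixed cost and the slope estimates variable cost per unit. This is a minimal sketch under that assumption; the data and the function name are hypothetical and are not taken from the teaching case, and the sketch ignores the outlier, autocorrelation, and inflation issues the case addresses.

```python
def fit_cost_function(volumes, total_costs):
    """Least-squares fit of total_cost = fixed + variable * volume.

    Returns (fixed_cost, variable_cost_per_unit).
    """
    n = len(volumes)
    mean_x = sum(volumes) / n
    mean_y = sum(total_costs) / n
    sxx = sum((x - mean_x) ** 2 for x in volumes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(volumes, total_costs))
    variable = sxy / sxx                 # slope: cost per additional unit
    fixed = mean_y - variable * mean_x   # intercept: cost at zero volume
    return fixed, variable


# Hypothetical monthly data: occupied room-nights and total operating cost
room_nights = [100, 200, 300, 400]
costs = [6200.0, 7400.0, 8600.0, 9800.0]

fixed, variable = fit_cost_function(room_nights, costs)
print(f"fixed cost = {fixed:.2f}, variable cost per room-night = {variable:.2f}")
```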

29. Enzyme Kinetics Analysis: An online tool for analyzing enzyme initial rate data and teaching enzyme kinetics.

Author: Mak, Daniel A., Dunn, Sebastian, Coombes, David, Carere, Carlo R., Allison, Jane R., Nock, Volker, Hudson, André O., and Dobson, Renwick C. J.
Subjects: ENZYME kinetics, CHEMICAL processes, FREEWARE (Computer software), CYTOLOGY, RESEARCH personnel, MOLECULAR biology, LIVING polymerization
Abstract: Enzymes are nature's catalysts, mediating chemical processes in living systems. The study of enzyme function and mechanism includes defining the maximum catalytic rate and affinity for substrate/s (among other factors), referred to as enzyme kinetics. Enzyme kinetics is a staple of biochemistry curricula and other disciplines, from molecular and cellular biology to pharmacology. However, because enzyme kinetics involves concepts rarely employed in other areas of biology, it can be challenging for students and researchers. Traditional graphical analysis was replaced by computational analysis, requiring another skill not core to many life sciences curricula. Computational analysis can be time‐consuming and difficult in free software (e.g., R) or require costly software (e.g., GraphPad Prism). We present Enzyme Kinetics Analysis (EKA), a web‐tool to augment teaching and learning and streamline EKA. EKA is an interactive and free tool for analyzing enzyme kinetic data and improving student learning through simulation, built using R and RStudio's ShinyApps. EKA provides kinetic models (Michaelis–Menten, Hill, simple reversible inhibition models, ternary‐complex, and ping‐pong) for users to fit experimental data, providing graphical results and statistics. Additionally, EKA enables users to input parameters and create data and graphs, to visualize changes to parameters (e.g., KM or number of measurements). This function is designed for students learning kinetics but also for researchers to design experiments. EKA (enzyme-kinetics.shinyapps.io/enzkinet_webpage/) provides a simple, interactive interface for teachers, students, and researchers to explore enzyme kinetics. It gives researchers the ability to design experiments and analyze data without specific software requirements. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
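
Illustrative note on record 29: the Michaelis-Menten model named in the abstract relates initial rate v to substrate concentration S as v = Vmax * S / (KM + S). The sketch below simulates rates from chosen parameters and then recovers them with a Hanes-Woolf linearisation (regressing S/v on S). The linearisation is a substitute chosen to keep the example dependency-free; it is not claimed to be how EKA fits its models, and all names and values are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (vmax, km) from the line S/v = S/Vmax + Km/Vmax.

    Requires non-zero substrate concentrations and rates.
    """
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Simulate noise-free rates from hypothetical parameters, then refit
concentrations = [0.5, 1.0, 2.0, 5.0, 10.0]
simulated = [michaelis_menten(s, vmax=10.0, km=2.0) for s in concentrations]
vmax_hat, km_hat = hanes_woolf_fit(concentrations, simulated)
print(f"Vmax = {vmax_hat:.3f}, KM = {km_hat:.3f}")
```

With real, noisy measurements a nonlinear least-squares fit is generally preferred, because linearisations distort the error structure.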

30. An Interface to Monitor Process Variability Using the Binomial ATTRIVAR SS Control Chart.

Author: Violante, João Pedro Costa, Machado, Marcela A. G., Mendes, Amanda dos Santos, and Almeida, Túlio S.
Subjects: QUALITY control charts, STATISTICAL process control, MANUFACTURING processes, QUALITY control
Abstract: Control charts are tools of paramount importance in statistical process control. They are broadly applied in monitoring processes and improving quality, as they allow the detection of special causes of variation with a significant level of accuracy. Furthermore, there are several strategies able to be employed in different contexts, all of which offer their own advantages. Therefore, this study focuses on monitoring the variability in univariate processes through variance using the Binomial version of the ATTRIVAR Same Sample S2 (B-ATTRIVAR SS S2) control chart, given that it allows coupling attribute and variable inspections (ATTRIVAR means attribute + variable), i.e., taking advantage of the cost-effectiveness of the former and the wealth of information and greater performance of the latter. Its Binomial version was used for such a purpose, since inspections are made using two attributes, and the Same Sample was used due to being submitted to both the attribute and variable stages of inspection. A computational application was developed in the R language using the Shiny package so as to create an interface to facilitate its application and use in the quality control of the production processes. Its application enables users to input process parameters and generate the B-ATTRIVAR SS control chart for monitoring the process variability with variance. By comparing the data obtained from its application with a simpler code, its performance was validated, given that its results exhibited striking similarity. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. NeuroDecodeR: a package for neural decoding in R.

Author: Meyers, Ethan M.
Subjects: RESEARCH personnel, MODULAR design, PACKAGING design, DATA analysis, FASHION design, NEUROSCIENCES
Abstract: Neural decoding is a powerful method to analyze neural activity. However, the code needed to run a decoding analysis can be complex, which can present a barrier to using the method. In this paper we introduce a package that makes it easy to perform decoding analyses in the R programming language. We describe how the package is designed in a modular fashion which allows researchers to easily implement a range of different analyses. We also discuss how to format data to be able to use the package, and we give two examples of how to use the package to analyze real data. We believe that this package, combined with the rich data analysis ecosystem in R, will make it significantly easier for researchers to create reproducible decoding analyses, which should help increase the pace of neuroscience discoveries. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. OmicNavigator: open-source software for the exploration, visualization, and archival of omic studies.

Author: Ernst, Terrence R., Blischak, John D., Nordlund, Paul, Dalen, Joe, Moore, Justin, Bhamidipati, Akshay, Dwivedi, Pankaj, LoGrasso, Joe, Curado, Marco Rocha, and Engelmann, Brett Warren
Subjects: DATA visualization, WEB-based user interfaces, APPLICATION software, DATA libraries, COMPUTER software
Abstract: Background: The results of high-throughput biology ('omic') experiments provide insight into biological mechanisms but can be challenging to explore, archive and share. The scale of these challenges continues to grow as omic research volume expands and multiple analytical technologies, bioinformatic pipelines, and visualization preferences have emerged. Multiple software applications exist that support omic study exploration and/or archival. However, an opportunity remains for open-source software that can archive and present the results of omic analyses with broad accommodation of study-specific analytical approaches and visualizations with useful exploration features. Results: We present OmicNavigator, an R package for the archival, visualization and interactive exploration of omic studies. OmicNavigator enables bioinformaticians to create web applications that interactively display their custom visualizations and analysis results linked with app-derived analytical tools, graphics, and tables. Studies created with OmicNavigator can be viewed within an interactive R session or hosted on a server for shared access. Conclusions: OmicNavigator can be found at https://github.com/abbvie-external/OmicNavigator [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. GbyE: an integrated tool for genome widely association study and genome selection based on genetic by environmental interaction.

Author: Liu, Xinrui, Wang, Mingxiu, Qin, Jie, Liu, Yaxin, Wang, Shikai, Wu, Shiyu, Zhang, Ming, Zhong, Jincheng, and Wang, Jiabo
Subjects: GENOMES, KRONECKER products, GENETIC markers, CHROMOSOMES, STATISTICAL power analysis, GENOTYPE-environment interaction, GENETIC correlations
Abstract: Background: The growth and development of organisms depend on genetic effects, environmental effects, and their interaction. In recent decades, many candidate additive genetic markers and genes have been detected using genome-wide association studies (GWAS). However, because of limited computing power and a lack of practical tools, the interactive effects of markers and genes have not been clearly revealed, and these interactive markers are difficult to use in breeding and prediction, such as genome selection (GS). Results: Through the Power-FDR curve, the GbyE algorithm can detect more significant genetic loci at different levels of genetic correlation and heritability, especially at low heritability levels. The additive effect of GbyE exhibits high significance on certain chromosomes, while the interactive effect detects more significant sites on other chromosomes, which were not detected in the first two parts. In prediction accuracy testing, in most cases of heritability and genetic correlation, the majority of prediction accuracy of GbyE is significantly higher than that of the mean method, regardless of whether the rrBLUP model or BGLR model is used for statistics. The GbyE algorithm improves the prediction accuracy of the three Bayesian models BRR, BayesA, and BayesLASSO using information from genetic by environmental interaction (G × E) and increases the prediction accuracy by 9.4%, 9.1%, and 11%, respectively, relative to the Mean value method. The GbyE algorithm is significantly superior to the mean method in the absence of a single environment, regardless of the combination of heritability and genetic correlation, especially in the case of high genetic correlation and heritability. Conclusions: Therefore, this study constructed a new genotype design model program (GbyE) for GWAS and GS using the Kronecker product, which was able to clearly estimate the additive and interactive effects separately. The results showed that GbyE can provide higher statistical power for GWAS and higher prediction accuracy for the GS models. In addition, GbyE gives varying degrees of improvement in prediction accuracy in three Bayesian models (BRR, BayesA, and BayesCpi). Whether phenotypes were missing in a single environment or in multiple environments, GbyE also made better predictions for the inference population. This study helps us understand the interactive relationship between genome and environment in complex traits. The GbyE source code is available at the GitHub website (https://github.com/liu-xinrui/GbyE). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. SPIRO – the automated Petri plate imaging platform designed by biologists, for biologists.

Author: Ohlsson, Jonas A., Leong, Jia Xuan, Elander, Pernilla H., Ballhaus, Florentine, Holla, Sanjana, Dauphinee, Adrian N., Johansson, Johan, Lommel, Mark, Hofmann, Gero, Betnér, Staffan, Sandgren, Mats, Schumacher, Karin, Bozhkov, Peter V., and Minina, Elena A.
Subjects: BIOLOGISTS, GERMINATION, ROOT growth, PLANT growth, REMOTE control, IMAGE processing
Abstract: SUMMARY: Phenotyping of model organisms grown on Petri plates is often carried out manually, despite the procedures being time‐consuming and laborious. The main reason for this is the limited availability of automated phenotyping facilities, whereas constructing a custom automated solution can be a daunting task for biologists. Here, we describe SPIRO, the Smart Plate Imaging Robot, an automated platform that acquires time‐lapse photographs of up to four vertically oriented Petri plates in a single experiment, corresponding to 192 seedlings for a typical root growth assay and up to 2500 seeds for a germination assay. SPIRO is catered specifically to biologists' needs, requiring no engineering or programming expertise for assembly and operation. Its small footprint is optimized for standard incubators, the inbuilt green LED enables imaging under dark conditions, and remote control provides access to the data without interfering with sample growth. SPIRO's excellent image quality is suitable for automated image processing, which we demonstrate using seed germination and root growth assays as examples. Furthermore, the robot can be easily customized for specific uses, as all information about SPIRO is released under open‐source licenses. Importantly, uninterrupted imaging allows considerably more precise assessment of seed germination parameters and root growth rates compared with manual assays. Moreover, SPIRO enables previously technically challenging assays such as phenotyping in the dark. We illustrate the benefits of SPIRO in proof‐of‐concept experiments which yielded a novel insight on the interplay between autophagy, nitrogen sensing, and photoblastic response. Significance Statement: SPIRO addresses the main bottleneck preventing widespread use of many fantastic custom‐made phenotyping platforms: feasibility to replicate the system without access to engineering skills. SPIRO is designed to be built and operated with no training in engineering or programming; it enables continuous imaging within standard plant growth cabinets under both day and night conditions, integrates seamlessly into existing laboratory workflows, and provides exceptional data quality, thereby facilitating a new research standard for all laboratories. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Logic in the deep end.

Author: Leach-Krouse, Graham, Logan, Shay Allen, and Worley, Blane
Subjects: LOGIC, SUBSTITUTION (Logic), RELEVANCE logic, AXIOMATIC recursion theory, RELEVANCE
Abstract: Weak enough relevant logics are often closed under depth substitutions. To determine the breadth of logics with this feature, we show there is a largest sublogic of R closed under depth substitutions and that this logic can be recursively axiomatized. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Check your outliers! An introduction to identifying statistical outliers in R with easystats.

Author: Thériault, Rémi, Ben-Shachar, Mattan S., Patil, Indrajeet, Lüdecke, Daniel, Wiernik, Brenton M., and Makowski, Dominique
Subjects: OUTLIER detection, PRODUCTION standards, STATISTICAL software, BEST practices
Abstract: Beyond the challenge of keeping up to date with current best practices regarding the diagnosis and treatment of outliers, an additional difficulty arises concerning the mathematical implementation of the recommended methods. Here, we provide an overview of current recommendations and best practices and demonstrate how they can easily and conveniently be implemented in the R statistical computing software, using the {performance} package of the easystats ecosystem. We cover univariate, multivariate, and model-based statistical outlier detection methods, their recommended threshold, standard output, and plotting methods. We conclude by reviewing the different theoretical types of outliers, whether to exclude or winsorize them, and the importance of transparency. A preprint of this paper is available at: 10.31234/osf.io/bu6nt. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
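
Illustrative note on record 36: one widely used univariate approach to outlier detection is a robust z-score based on the median and the median absolute deviation (MAD). The sketch below is a plain Python illustration of that general idea only; it does not reproduce the {performance} package, and the function name and the threshold of 3.0 are assumptions rather than the paper's recommended values.

```python
import statistics


def robust_z_outliers(values, threshold=3.0):
    """Flag values whose robust z-score exceeds the threshold.

    The robust z-score is |x - median| / (1.4826 * MAD); the constant
    makes the MAD comparable to a standard deviation under normality.
    The default threshold is an assumption for illustration.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably
        return [False] * len(values)
    scale = 1.4826 * mad
    return [abs(v - med) / scale > threshold for v in values]


data = [1, 2, 3, 4, 100]
flags = robust_z_outliers(data)
print([v for v, is_out in zip(data, flags) if is_out])
```

Unlike a mean-based z-score, the median and MAD are barely affected by the extreme value itself, so a single large observation cannot mask its own detection.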

37. Producing Fast and Convenient Machine Learning Benchmarks in R with the stressor Package.

Author: HAYCOCK, SAM, BEAN, BRENNAN, and BURCHFIELD, EMILY
Subjects: MACHINE learning, PYTHON programming language, CROP yields, AGRICULTURE, RESEARCH personnel
Abstract: The programming overhead required to implement machine learning workflows creates a barrier for many discipline-specific researchers with limited programming experience. The stressor package provides an R interface to Python's PyCaret package, which automatically tunes and trains 14-18 machine learning (ML) models for use in accuracy comparisons. In addition to providing an R interface to PyCaret, stressor also contains functions that facilitate synthetic data generation and variants of cross-validation that allow for easy benchmarking of the ability of machine-learning models to extrapolate or compete with simpler models on simpler data forms. We show the utility of stressor on two agricultural datasets, one using classification models to predict crop suitability and another using regression models to predict crop yields. Full ML benchmarking workflows can be completed in only a few lines of code with relatively small computational cost. The results, and more importantly the workflow, provide a template for how applied researchers can quickly generate accuracy comparisons of many machine learning models with very little programming. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. Introduction to Reproducible Geospatial Analysis and Figures in R: A Tutorial Article.

Author: Maesen, Philippe and Salingros, Edouard
Subjects: GEOSPATIAL data, VECTOR data, REPRODUCIBLE research, EDUCATIONAL objectives, WORKFLOW
Abstract: The present article is intended to serve an educational purpose for data scientists and students who already have experience with the R language and wish to start using it for geospatial analysis and map creation. The basic concepts of raster data, vector data, CRS and datum are first presented along with a basic workflow to conduct reproducible geospatial research in R. Examples of important types of maps (scatter, bubble, choropleth, hexbin and faceted) created from open-source environmental data are illustrated and their practical implementation in R is discussed. Through these examples, essential manipulations on geospatial vector data are demonstrated (reading, transforming CRS, creating geometries from scratch, buffer zones around existing geometries and intersections between geometries). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. An efficient eavesdropping model for detection of advanced persistent threat (APT) in high volume network traffic.

Author: Veena, R. C. and Brahmananda, S. H.
Abstract: Eavesdropping, commonly referred to as network analysis, is the process of gathering data traffic. To check if attackers are sneaking into a network, a thorough examination is essential. The risk of APT has considerably increased as a result of the rapid expansion of internet use and linked gadgets. The goal of this research is to develop an eavesdropping model. To train the developed system, a publicly available dataset containing a range of simulated breaches in a military-grade network environment is used. The model can examine, decode, and display malicious data packets from commonly used protocols. The objective is to determine whether a threat might be present in the network. Before the firewall, a program keeps track of data transfer over a network. The detection model's use of historical learning of publicly accessible threat patterns is what makes this study novel. Among the features are a reliable model for APT detection, an intuitive user interface, and statistical analysis capabilities. With an accuracy of 99.99% and a detection time of 0.2 seconds, Random Forest provided the greatest classification performance. The acquired accuracy is higher than the 98.85% accuracy that was previously published. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. Using plausible values when fitting multilevel models with large-scale assessment data using R.

Author: Huang, Francis L.
Subjects: MULTILEVEL models, DATA analysis
Abstract: The use of large-scale assessments (LSAs) in education has grown in the past decade though analysis of LSAs using multilevel models (MLMs) using R has been limited. A reason for its limited use may be due to the complexity of incorporating both plausible values and weighted analyses in the multilevel analyses of LSA data. We provide additional functions in R that extend the functionality of the WeMix (Bailey et al., 2023) package to allow for the automatic pooling of plausible values. In addition, functions for model comparisons using plausible values and the ability to export output to different formats (e.g., Word, html) are also provided. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
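
Illustrative note on record 40: pooling results across plausible values is commonly done with Rubin's rules, where the pooled estimate is the mean of the per-value estimates and the total variance combines within- and between-value variance. The sketch below shows that calculation for a single coefficient. It is a generic illustration, not the functions described in the paper or the WeMix API, and the function name and numbers are hypothetical.

```python
import statistics


def pool_plausible_values(estimates, variances):
    """Pool one coefficient across m plausible values using Rubin's rules.

    estimates: the coefficient estimated once per plausible value
    variances: the squared standard error from each of those fits
    Returns (pooled_estimate, pooled_standard_error). Requires m >= 2.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)        # average sampling variance
    between = statistics.variance(estimates)    # variance across plausible values
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical results from fitting the same model to five plausible values
coefs = [0.42, 0.45, 0.40, 0.44, 0.43]
ses = [0.05, 0.05, 0.06, 0.05, 0.05]
est, se = pool_plausible_values(coefs, [s ** 2 for s in ses])
print(f"pooled estimate = {est:.3f}, pooled SE = {se:.3f}")
```

The between-value term is what a single-plausible-value analysis omits, which is why such analyses tend to understate standard errors.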

41. With super SDMs (machine learning, open access big data, and the cloud) towards more holistic global squirrel hotspots and coldspots.

Author: Steiner, Moriz, Huettmann, F., Bryans, N., and Barker, B.
Subjects: BIG data, MACHINE learning, SQUIRRELS, NUMBERS of species, SPECIES distribution
Abstract: Species-habitat associations are correlative, can be quantified, and can be used for powerful inference. Nowadays, Species Distribution Models (SDMs) play a big role, e.g. using Machine Learning and AI algorithms, but their best-available technical opportunities are still not used to their full potential, e.g. in the policy sector. Here we present Super SDMs that invoke ML, OA Big Data, and the Cloud with a workflow for the best-possible inference for the 300+ global squirrel species. Such global Big Data models are especially important for the many marginalized squirrel species and the high number of endangered and data-deficient species in the world, specifically in tropical regions. While our work shows common issues with SDMs and the maxent algorithm ('Shallow Learning'), here we present a multi-species Big Data SDM template for subsequent ensemble models and generic progress to tackle global species hotspot and coldspot assessments for a more inclusive and holistic inference. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. Optimizing large real‐world data analysis with parquet files in R: A step‐by‐step tutorial.

Author: Abdelaziz, Abdullah I., Hanson, Kent A., Gaber, Charles E., and Lee, Todd A.
Abstract: Purpose: The use of open‐source programming languages can facilitate open science practices in real‐world evidence (RWE) studies. Real‐world studies often rely on using big data, which makes using such languages complicated. We demonstrate an efficient approach that enables RWE researchers to use R to undertake RWE analysis tasks from cohort building to reporting. Methods: Using the Merative Marketscan data (2017–2019), we developed an R function to transform the data into parquet format to be used in R. Then, we compared the differences in data size before and after transformation. We compared the performance of the transformed data in R to the original data in terms of numerical consistency and running times required to complete simple exploratory tasks. To show how the transformed databases can be used in practice, we conducted a simplified replication of an active comparator new user study from the literature. All codes are available on GitHub. Results: Our approach exhibited high efficiency in data storage, as evidenced by the converted data size, which ranged from 10% to 43% of that of the original data files. The runtime of the exploratory tasks in R generally outperformed that of the original data with SAS. We showed, through example, how the converted data can be efficiently used to implement an RWE study. Conclusion: We demonstrate a free and efficient solution to facilitate the use of open‐source programming languages with big real‐world databases, which can facilitate the adoption of open science practices. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. REDCapDM: An R package with a set of data management tools for a REDCap project.

Author: Carmezim, João, Satorra, Pau, Peñafiel, Judith, García-Lerma, Esther, Pallarès, Natàlia, Santos, Naiara, and Tebé, Cristian
Subjects: DATA management, WEB-based user interfaces, ONLINE databases, ELECTRONIC data processing, INTERNET surveys
Abstract: Background: Research Electronic Data CAPture (REDCap) is a web application for creating and managing online surveys and databases. Clinical data management is an essential process before performing any statistical analysis to ensure the quality and reliability of study information. Processing REDCap data in R can be complex and often benefits from automation. While there are several R packages available for specific tasks, none offer an expansive approach to data management. Results: The REDCapDM is an R package for accessing and managing REDCap data. It imports data from REDCap to R using either an API connection or the files in R format exported directly from REDCap. It has several functions for data processing and transformation, and it helps to generate and manage queries to clarify or resolve discrepancies found in the data. Conclusion: The REDCapDM package is a valuable tool for data scientists and clinical data managers who use REDCap and R. It assists in tasks such as importing, processing, and quality-checking data from their research studies. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. Primer on Reproducible Research in R: Enhancing Transparency and Scientific Rigor.

Author: Siraji, Mushfiqul Anwar and Rahman, Munia
Subjects: REPRODUCIBLE research, SCIENTIFIC knowledge, PROGRAMMING languages, SOMNOLOGY, RESEARCH personnel
Abstract: Achieving research reproducibility is a precarious aspect of scientific practice. However, many studies across disciplines fail to be fully reproduced due to inadequate dissemination methods. Traditional publication practices often fail to provide a comprehensive description of the research context and procedures, hindering reproducibility. To address these challenges, this article presents a tutorial on reproducible research using the R programming language. The tutorial aims to equip researchers, including those with limited coding knowledge, with the necessary skills to enhance reproducibility in their work. It covers three essential components: version control using Git, dynamic document creation using rmarkdown, and managing R package dependencies with renv. The tutorial also provides insights into sharing reproducible research and offers specific considerations for the field of sleep and chronobiology research. By following the tutorial, researchers can adopt practices that enhance the transparency, rigor, and replicability of their work, contributing to a culture of reproducible research and advancing scientific knowledge. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. recolorize: An R package for flexible colour segmentation of biological images.

Author: Weller, Hannah I., Hiller, Anna E., Lord, Nathan P., and Van Belleghem, Steven M.
Subjects: FLEXIBLE packaging, IMAGE segmentation, COLOR, BATCH processing, BIOLOGICAL variation
Abstract: Colour pattern variation provides biological information in fields ranging from disease ecology to speciation dynamics. Comparing colour pattern geometries across images requires colour segmentation, where pixels in an image are assigned to one of a set of colour classes shared by all images. Manual methods for colour segmentation are slow and subjective, while automated methods can struggle with high technical variation in aggregate image sets. We present recolorize, an R package toolbox for human‐subjective colour segmentation with functions for batch‐processing low‐variation image sets and additional tools for handling images from diverse (high‐variation) sources. The package also includes export options for a variety of formats and colour analysis packages. This paper illustrates recolorize for three example datasets, including high variation, batch processing and combining with reflectance spectra, and demonstrates the downstream use of methods that rely on this output. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

46. NeuroDecodeR: a package for neural decoding in R.

Author: Meyers, Ethan M.
Subjects: RESEARCH personnel, MODULAR design, PACKAGING design, DATA analysis, FASHION design, NEUROSCIENCES
Abstract: Neural decoding is a powerful method to analyze neural activity. However, the code needed to run a decoding analysis can be complex, which can present a barrier to using the method. In this paper we introduce a package that makes it easy to perform decoding analyses in the R programming language. We describe how the package is designed in a modular fashion which allows researchers to easily implement a range of different analyses. We also discuss how to format data to be able to use the package, and we give two examples of how to use the package to analyze real data. We believe that this package, combined with the rich data analysis ecosystem in R, will make it significantly easier for researchers to create reproducible decoding analyses, which should help increase the pace of neuroscience discoveries. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

47. cytoviewer: an R/Bioconductor package for interactive visualization and exploration of highly multiplexed imaging data.

Author: Meyer, Lasse, Eling, Nils, and Bodenmiller, Bernd
Subjects: DATA visualization, GRAPHICAL user interfaces, PROGRAMMING languages, IMAGE analysis, BIOMOLECULES, QUALITY control
Abstract: Background: Highly multiplexed imaging enables single-cell-resolved detection of numerous biological molecules in their spatial tissue context. Interactive visualization of multiplexed imaging data is crucial at any step of data analysis to facilitate quality control and the spatial exploration of single cell features. However, tools for interactive visualization of multiplexed imaging data are not available in the statistical programming language R. Results: Here, we describe cytoviewer, an R/Bioconductor package for interactive visualization and exploration of multi-channel images and segmentation masks. The cytoviewer package supports flexible generation of image composites, allows side-by-side visualization of single channels, and facilitates the spatial visualization of single-cell data in the form of segmentation masks. As such, cytoviewer improves image and segmentation quality control, the visualization of cell phenotyping results and qualitative validation of hypothesis at any step of data analysis. The package operates on standard data classes of the Bioconductor project and therefore integrates with an extensive framework for single-cell and image analysis. The graphical user interface allows intuitive navigation and little coding experience is required to use the package. We showcase the functionality and biological application of cytoviewer by analysis of an imaging mass cytometry dataset acquired from cancer samples. Conclusions: The cytoviewer package offers a rich set of features for highly multiplexed imaging data visualization in R that seamlessly integrates with the workflow for image and single-cell data analysis. It can be installed from Bioconductor via https://www.bioconductor.org/packages/release/bioc/html/cytoviewer.html. The development version and further instructions can be found on GitHub at https://github.com/BodenmillerGroup/cytoviewer. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. Examining parallelization in kernel regression.

Author: Oltulu, Orcun and Gokalp Yavuz, Fulya
Subjects: PARALLEL algorithms, PROGRAMMING languages, PARALLEL processing, RESEARCH personnel, PARALLEL programming
Abstract: For a few decades, parallelization in statistical computing has been an increasing trend, and researchers have put significant effort into converting or adjusting known statistical methods and algorithms in parallel. The main reasons for the transition to parallel processes are the rapid growth in the size and the volume of data and the accelerated hardware developments. Divide and (re)combine (DnR) is one of the parallelization methods that allows the existing data or method to be implemented by dividing it into smaller pieces. It is possible to use the DnR method in most regression methods to reveal the relationship between the data. Although several libraries have been created in existing programming languages for many regression methods, such an approach is not yet used for kernel regression. However, it should be kept in mind that the kernel regression calculation method takes a relatively long time. For this reason, parallelization would be a handy strategy to decrease the calculation time in kernel regression. In this study, we aim to demonstrate how time efficiency is achieved using DnR methods for kernel regression with the help of several parallelization strategies in R. The results indicate that the computation time can be reduced proportionally with a trade-off between time and accuracy. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. The FAIRification of research in real‐world evidence: A practical introduction to reproducible analytic workflows using Git and R.

Author: Weberpals, Janick and Wang, Shirley V.
Abstract: Transparency and reproducibility are major prerequisites for conducting meaningful real‐world evidence (RWE) studies that are fit for decision‐making. Many advances have been made in the documentation and reporting of study protocols and results, but the principles for version control and sharing of analytic code in RWE are not yet as established as in other quantitative disciplines like computational biology and health informatics. In this practical tutorial, we aim to give an introduction to distributed version control systems (VCS) tailored toward the FAIR (Findable, Accessible, Interoperable, and Reproducible) implementation of RWE studies. To ease adoption, we provide detailed step‐by‐step instructions with practical examples on how the Git VCS and R programming language can be implemented into RWE study workflows to facilitate reproducible analyses. We further discuss and showcase how these tools can be used to track changes, collaborate, disseminate, and archive RWE studies through dedicated project repositories that maintain a complete audit trail of all relevant study documents. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. An Identification of Metastasis Regulators in Chicken (Gallus Gallus) Sarcoma Cell Lines Using Transcriptomic Data.

Author: Doan, Nhu P. Y. and Szarvas, Adrienn
Subjects: CHICKENS, ROUS sarcoma, CELL lines, SARCOMA, TRANSCRIPTOMES
Abstract: Rous sarcoma virus (RSV), which is an oncovirus, can cause sarcoma and consequently induce malignant tumours in chickens (Gallus gallus). Research into the molecular factors that regulate this tumour-inducing ability is essential to develop preventive and curative methods against RSV. In this study, we aimed to determine candidate genes contributing to the formation of tumours through a transcriptomic analysis in R with GSE42516 and GSE15141, which are microarray expression datasets in the GEO-NCBI database. We conducted differential expression analysis among a total of 8 metastatic samples and 5 non-metastatic samples, starting from data normalization, then creating model matrices for pairwise comparisons and using the eBayes function to calculate the log fold change values and significance level (p-value) of all genes. As a result, in GSE42516, we identified 295 significant (p-value ≤ 0.05) differentially expressed genes (DEGs), with 195 downregulated genes (logFC ≤ -1) and 190 upregulated genes (logFC ≥ 1). In GSE15141, a larger list of DEGs was extracted, with 1444 downregulated genes and 1314 upregulated genes. The top 5 DEGs retrieved in GSE42516 were TTC32, DHRS7, RARB, RSPO3 and C1QB, while RBM24, TOM1L1, LIPI, HINTW and C20orf59 were found in GSE15141. GO enrichment analysis (in this case, biological process - BP) revealed that the DEGs are mainly enriched in heterochromatin assembly, negative regulation of megakaryocyte differentiation and endocytosis. The identified genes may have a vital role in elucidating the molecular metastasis mechanisms and developing effective strategies against sarcoma virus. [ABSTRACT FROM AUTHOR]
Published: 2024

1,008 results on '"R"'
