1,477 results on '"Data mining"'
Search Results
2. Ethnic Density as a Key Factor to Narrow Health Disparities: A Case of American Indians and Alaska Natives.
- Author
-
Kim, Yong-Mi and Noyori-Corbett, Chie
- Subjects
- *ALASKA Natives, *DATA mining, *PROFESSIONAL practice, *SOCIOECONOMIC factors, *SOCIAL services, *EVALUATION of medical care, *SOCIAL work research, *RESEARCH, *ELECTRONIC health records, *HEALTH behavior, *HEALTH equity, *MINORITIES, *SOCIODEMOGRAPHIC factors, *CONFIDENCE intervals, *HEALTH promotion, *PSYCHOLOGY of Native Americans
- Abstract
Out of all the racial groups in the United States, people who identify as American Indian and Alaska Native (AI/AN) have disproportionately worse health as a result of living in poverty. The preponderance of research connects poor health with a socioeconomic perspective, which might create prejudice against AI/AN. As already known, AI/AN's high rates of obesity, diabetes, and stroke in comparison with that of other ethnic groups are mainly derived from their impoverished economic conditions that have forced them to consume the food distributed by the U.S. government. When minority health is discussed generally, the ethnic density perspective explains a minority population's positive health despite low socioeconomic status. This perspective helps researchers and practitioners understand the connections of psychological and social factors with physical health and demonstrates positive health effects on minority groups. Despite the high correlation between ethnic density and health having been validated, little to no research has explored AI/AN's health from this perspective. Using 13,064 electronic health records, this research tests the relationship between AI/AN density and health outcomes. This article introduces an innovative analytical strategy (i.e. a data mining technique), which is ideal for discovering frequently appearing health outcomes in a group. The finding reveals positive relationships between health outcomes and AI/AN density. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. Copyright and Text and Data Mining: Is the Current Legislation Sufficient and Adequate?
- Author
-
Fernández-Molina, Juan-Carlos and de la Rosa, Fernando Esteban
- Subjects
- *COPYRIGHT, *DATA mining, *ACADEMIC libraries, *CONCEPTUAL structures, *PROFESSIONAL licenses, *COMPARATIVE studies, *ACCESS to information
- Abstract
Text and data mining activities—that is, the automated processing of digital materials to uncover new knowledge—have become more frequent in all areas of scientific research. Because they require a massive use of copyrighted work, there are evident conflicts with copyright legislation. Countries at the forefront of research and development have begun to address this issue. This paper presents the basic aspects of legislation applicable to text and data mining activities. It offers a detailed comparative analysis of the norms of the main jurisdictions that have regulated them to date, highlighting in each case the positive and negative aspects. An adequate knowledge of these laws is not only important for researchers but also important for the academic librarians who provide advice and support in these matters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. American Twitter users revealed social determinant-related oral health disparities amid the COVID-19 pandemic.
- Author
-
Yangxin Fan, Hanjia Lyu, Jin Xiao, and Jiebo Luo
- Subjects
POPULATION density, DISCUSSION, ORAL health, SELF-evaluation, INCOME, WATER fluoridation, SOCIOECONOMIC disparities in health, HEALTH insurance, DESCRIPTIVE statistics, POPULATION health, POVERTY, LOGISTIC regression analysis, COVID-19 pandemic, PUBLIC opinion
- Abstract
Objectives: To assess self-reported population oral health conditions amid the COVID-19 pandemic using user reports on Twitter. Method and materials: Oral health-related tweets during the COVID-19 pandemic were collected from 9,104 Twitter users across 26 states (with sufficient samples) in the United States between 12 November 2020 and 14 June 2021. User demographics were inferred by leveraging the visual information from the user profile images. Other characteristics including income, population density, poverty rate, health insurance coverage rate, community water fluoridation rate, and relative change in the number of daily confirmed COVID-19 cases were acquired or inferred based on retrieved information from user profiles. Logistic regression was performed to examine whether discussions vary across user characteristics. Results: Overall, 26.70% of the Twitter users discussed "Wisdom tooth pain/jaw hurt," 23.86% tweeted about "Dental service/cavity," 18.97% discussed "Chipped tooth/tooth break," 16.23% talked about "Dental pain," and the rest tweeted about "Tooth decay/gum bleeding." Women and younger adults (19 to 29 years) were more likely to talk about oral health problems. Health insurance coverage rate was the most significant predictor in logistic regression for topic prediction. Conclusion: Tweets inform social disparities in oral health during the pandemic. For instance, people from counties at a higher risk of COVID-19 talked more about "Tooth decay/gum bleeding" and "Chipped tooth/tooth break." Older adults, who are vulnerable to COVID-19, were more likely to discuss "Dental pain." Topics of interest varied across user characteristics. Through the lens of social media, these findings may provide insights for oral health practitioners and policy makers. [ABSTRACT FROM AUTHOR]
- Published
- 2023
5. Exploring the Landscape of Immune Checkpoint Inhibitor-Induced Adverse Events Through Big Data Mining of Pan-Cancer Clinical Trials.
- Author
-
Fadlullah, Muhammad Zaki Hidayatullah, Lin, Ching-Nung, Coleman, Samuel, Young, Arabella, Naqash, Abdul Rafeh, Hu-Lieskovan, Siwen, and Tan, Aik Choon
- Subjects
RISK assessment, DATA mining, DRUG side effects, RESEARCH funding, CLINICAL trials, DATA analytics, IMMUNE checkpoint inhibitors, LONGITUDINAL method, TUMORS, DISEASE incidence
- Abstract
Purpose: Immune checkpoint inhibitors (ICIs) have significantly improved the survival of patients with cancer and provided long-term durable benefit. However, ICI-treated patients develop a range of toxicities known as immune-related adverse events (irAEs), which could compromise clinical benefits from these treatments. As the incidence and spectrum of irAEs differ across cancer types and ICI agents, it is imperative to characterize the incidence and spectrum of irAEs in a pan-cancer cohort to aid clinical management. Design: We queried >400 000 trials registered at ClinicalTrials.gov and retrieved a comprehensive pan-cancer database of 71 087 ICI-treated participants from 19 cancer types and 7 ICI agents. We performed data harmonization and cleaning of these trial results into 293 harmonized adverse event categories using Medical Dictionary for Regulatory Activities. Results: We developed irAExplorer (https://irae.tanlab.org), an interactive database that focuses on adverse events in patients administered with ICIs from big data mining. irAExplorer encompasses 71 087 distinct clinical trial participants from 343 clinical trials across 19 cancer types with well-annotated ICI treatment regimens and harmonized adverse event categories. We demonstrated a few of the irAE analyses through irAExplorer and highlighted some associations between treatment- or cancer-specific irAEs. Conclusion: The irAExplorer is a user-friendly resource that offers exploration, validation, and discovery of treatment- or cancer-specific irAEs across pan-cancer cohorts. We envision that irAExplorer can serve as a valuable resource to cross-validate users' internal datasets to increase the robustness of their findings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. State-Federal Vocational Rehabilitation Services, Demographic Characteristics and Employment Outcomes for Native Americans with Mental Illnesses.
- Author
-
Salimi, Nahal, Gere, Bryan, and Shahab, Amin
- Subjects
- *NATIVE Americans, *EMPLOYMENT of people with disabilities, *RESEARCH methodology, *MENTAL health, *PUBLIC health, *REHABILITATION counselors, *REHABILITATION of people with mental illness, *GOVERNMENT programs, *COMPARATIVE studies, *EXPERIENCE, *CHI-squared test, *VOCATIONAL rehabilitation, *SOCIODEMOGRAPHIC factors, *HEALTH equity, *PREDICTION models, *PUBLIC welfare, *SUPPORTED employment, *DATA mining
- Abstract
There were 9.7 million Native Americans (American Indian and Alaska Native, AI/AN; the acronym is used interchangeably with Native Americans throughout the paper) in 2019, comprising 2.9% of the U.S. population. Native American populations have disproportionately higher rates of mental illnesses compared to other racial groups in the U.S. Mental health is a significant public health concern for this population, impacting different areas of their lives including employment. Additionally, Native Americans continue to experience significant disparities in access to Vocational Rehabilitation (VR) services and have poor employment outcomes. However, little is known about the relationships among demographic factors, vocational rehabilitation services, and employment outcomes of Native Americans with mental illness. Consequently, the current study examined how demographic factors and VR services are related to successful employment outcomes for Native American VR clients with mental illnesses using data from the Rehabilitation Services Administration (RSA) program year (2019) Case Service Report (9–11). Both descriptive analysis and data mining approaches were used to answer the research questions. Chi-square Automatic Interaction Detector (CHAID) analysis was used to determine which of the VR services could best predict the successful employment outcome of Native Americans with mental illness. The findings of the data mining approach revealed that among all the vocational rehabilitation services, job placement assistance was the strongest predictor of successful employment among Native American clients with mental illnesses. The second most important service predicting successful employment for those who received job placement assistance was shown to be maintenance. Implications for rehabilitation counselors and future research are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Diverting Data and Drugs: A Narrative Review of the Mallinckrodt Documents.
- Author
-
Lentacker, Antoine, Pham, Kelly, and Chernesky, Jason M.
- Subjects
- *DOCUMENTATION, *PRODUCT safety, *DATA mining, *DATABASE management, *CONTROLLED substances, *PHARMACEUTICAL industry, *OPIOID analgesics, *OPIOID epidemic, *DRUGS
- Abstract
U.S. law imposes strict recording and reporting requirements on all entities that manufacture and distribute controlled substances. As a result, the prescription opioid crisis has unfolded in a data-saturated environment. This article asks why the systematic documentation of opioid transactions failed to prevent or mitigate the crisis. Drawing on a recently disclosed trove of 1.4 million internal records from Mallinckrodt Pharmaceuticals, a leading manufacturer of prescription opioids, we highlight a phenomenon we propose to call data diversion, whereby data ostensibly generated or collected for the purpose of regulating the distribution of controlled substances were repurposed by the industry for the opposite aim of increasing sales at all costs. Systematic data diversion, we argue, contributed substantially to the scale of drug diversion seen with opioids and should become a focus of policy intervention. [ABSTRACT FROM AUTHOR]
- Published
- 2024
8. Emerging Trends of Self-Harm Using Sodium Nitrite in an Online Suicide Community: Observational Study Using Natural Language Processing Analysis.
- Author
-
Das, Sudeshna, Walker, Drew, Rajwal, Swati, Lakamana, Sahithi, Sumner, Steven A., Mack, Karin A., Kaczkowski, Wojciech, and Sarker, Abeed
- Subjects
SUICIDE prevention, DATA mining, DATA analysis, SCIENTIFIC observation, ANTIEMETICS, NATURAL language processing, INTERNET, DESCRIPTIVE statistics, SELF-mutilation, NITRITES, STATISTICS, MACHINE learning, COMPARATIVE studies, EPIDEMIOLOGICAL research
- Abstract
Background: There is growing concern around the use of sodium nitrite (SN) as an emerging means of suicide, particularly among younger people. Given the limited information on the topic from traditional public health surveillance sources, we studied posts made to an online suicide discussion forum, "Sanctioned Suicide," which is a primary source of information on the use and procurement of SN. Objective: This study aims to determine the trends in SN purchase and use, as obtained via data mining from subscriber posts on the forum. We also aim to determine the substances and topics commonly co-occurring with SN, as well as the geographical distribution of users and sources of SN. Methods: We collected all publicly available posts from the site's inception in March 2018 to October 2022. Using data-driven methods, including natural language processing and machine learning, we analyzed the trends in SN mentions over time, including the locations of SN consumers and the sources from which SN is procured. We developed a transformer-based source and location classifier to determine the geographical distribution of the sources of SN. Results: Posts pertaining to SN show a rise in popularity, and there were statistically significant correlations between real-life use of SN and suicidal intent when compared to data from the Centers for Disease Control and Prevention (CDC) Wide-Ranging Online Data for Epidemiologic Research (ρ=0.727; P<.001) and the National Poison Data System (ρ=0.866; P=.001). We observed frequent co-mentions of antiemetics, benzodiazepines, and acid regulators with SN. Our proposed machine learning-based source and location classifier can detect potential sources of SN with an accuracy of 72.92% and showed consumption in the United States and elsewhere. Conclusions: Vital information about SN and other emerging mechanisms of suicide can be obtained from online forums. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. The Current Landscape of Faculty Developers in Scholarship of Teaching and Learning Across Diverse Campuses in the United States.
- Author
-
Ko, Melissa E.
- Subjects
- *SCHOLARSHIPS, *EVIDENCE-based education, *DATA mining, *EDUCATIONAL planning
- Abstract
Centers for Teaching and Learning (CTLs) are uniquely poised to support instructors engaging in Scholarship of Teaching and Learning (SoTL) through professional expertise in evidence-based teaching practice and dedicated staff resources. Models for this support have ranged from purely a funding source, to learning communities, to one-off technical training and consultations, to comprehensive mentoring and partnerships. In the decade since Schwartz and Haynie published "Faculty Development Centers and the Role of SoTL," we aimed to profile the current landscape of university CTLs and their involvement in SoTL. In this review, we draw on the multiple models of CTL participation in SoTL developed by Lukes et al. to categorize the work conducted at a sample of American institutions. Using a data mining approach of publicly available information online, we compiled a sample dataset that shows the distribution of CTLs across the US engaging in various forms of SoTL. We examine current trends of CTL and SoTL presence amongst institution types and geographic regions, with consideration for different SoTL program models. We conclude with a discussion of the current landscape of CTLs and their SoTL involvement compared to our aspirations: what will the future of faculty development look like, and what role will SoTL play? Given the pros and cons of each different model for CTL and SoTL integration, is the current distribution of these models as effective as it could be? What changes could lead to greater impact both for CTLs and for SoTL? [ABSTRACT FROM AUTHOR]
- Published
- 2023
10. 5. Data intensity in the United Kingdom, Canada and the United States.
- Author
-
Schmidt, Julia, Pilgrim, Graham, and Mourougane, Annabelle
- Subjects
DATA mining, JOB postings, INTERNET advertising, OCCUPATIONS
- Abstract
The article discusses research which presented the results of an analysis of the data intensity of jobs derived from the online job advertisements data provided by Lightcast, a private data provider, from Great Britain, Canada and the U.S. Topics include similarity of data intensity by occupation across the three countries, the variations in the importance of data-intensive jobs in almost every industrial sector and derivation of estimates of investment in data.
- Published
- 2023
11. Mapping the field of physical therapy and identification of the leading active producers. A bibliometric analysis of the period 2000–2018.
- Author
-
Carballo-Costa, Lidia, Quintela-Del-Río, Alejandro, Vivas-Costa, Jamile, and Costas, Rodrigo
- Subjects
- *BIBLIOMETRICS, *CITATION analysis, *RESEARCH funding, *PHYSICAL therapy research, *DATA analysis software, *THEMATIC analysis, *DATA mining
- Abstract
The objectives of the study were: 1) Describe the thematic structure and evolution of the field of physical therapy; 2) identify the main research producers (i.e. countries and institutions); and 3) compare their research output and citation impact. Papers related to physical therapy indexed in Web of Science (2000–2018) were identified to delineate the field, using keywords, journals, and citation networks. VOSviewer software, advanced bibliometric text mining, and visualization techniques were used to evaluate the thematic structure. We collected data about the country and institutional affiliation of all the authors and calculated production and citation impact indicators. 85,697 papers were analyzed. Eleven thematic clusters were identified: 1) "health care and education"; 2) "biomechanics"; 3) "psychosocial, chronic pain and quality of life outcomes"; 4) "evidence-based physical therapy research methods"; 5) "traumatology and orthopedics"; 6) "neurological rehabilitation"; 7) "psychometrics and cross-cultural adaptation"; 8) "gait-balance analysis and Parkinson's disease"; 9) "exercise"; 10) "respiratory physical therapy"; and 11) "back pain." The United States, the United Kingdom, and Australia were the most productive countries. Netherlands, Norway, and Sweden had the highest citation impact. Our bibliometric visualization approach makes it possible to comprehensively study the thematic structure of physical therapy. The ranking of producers has evolved and now includes China and Brazil. High research production does not imply a high citation impact. [ABSTRACT FROM AUTHOR]
- Published
- 2023
12. Sentiment, we-talk and engagement on social media: insights from Twitter data mining on the US presidential elections 2020.
- Author
-
Hagemann, Linus and Abramova, Olga
- Subjects
- *UNITED States presidential election, 2020, *DATA mining, *SOCIAL media, *USER-generated content, *NEGATIVITY bias, *INFORMATION dissemination, *SENTIMENT analysis
- Abstract
Purpose: Given inconsistent results in prior studies, this paper applies the dual process theory to investigate what social media messages yield audience engagement during a political event. It tests how affective cues (emotional valence, intensity and collective self-representation) and cognitive cues (insight, causation, certainty and discrepancy) contribute to public engagement. Design/methodology/approach: The authors created a dataset of more than three million tweets during the 2020 United States (US) presidential elections. Affective and cognitive cues were assessed via sentiment analysis. The hypotheses were tested in negative binomial regressions. The authors also scrutinized a subsample of far-famed Twitter users. The final dataset, scraping code, preprocessing and analysis are available in an open repository. Findings: The authors found the prominence of both affective and cognitive cues. For the overall sample, negativity bias was registered, and the tweet's emotionality was negatively related to engagement. In contrast, in the sub-sample of tweets from famous users, emotionally charged content produced higher engagement. The role of sentiment decreases when the number of followers grows and ultimately becomes insignificant for Twitter participants with many followers. Collective self-representation ("we-talk") is consistently associated with more likes, comments and retweets in the overall sample and subsamples. Originality/value: The authors expand the dominating one-sided perspective to social media message processing focused on the peripheral route and hence affective cues. Leaning on the dual process theory, the authors shed light on the effectiveness of both affective (peripheral route) and cognitive (central route) cues on information appeal and dissemination on Twitter during a political event. The popularity of the tweet's author moderates these relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2023
13. Have consumers escaped from COVID-19 restrictions by seeking variety? A Machine Learning approach analyzing wine purchase behavior in the United States.
- Author
-
Rinke, Wolfram and Ho, Shuay-Tsyr
- Subjects
CONSUMER behavior, MACHINE learning, WINE stores, CONSUMERS, COVID-19, WINES
- Abstract
The COVID-19 pandemic itself constitutes an environment for people to experience the potential loss of control and freedom due to social distancing measures and other government orders. Variety-seeking has been treated as a mechanism to regain a sense of self-control. Using a machine learning model and household-level data with a focus on the wine market in the United States, this study showcases the changing variety-seeking behavior over the pandemic year of 2020, in which people's perception of the status of restriction measures influences the degree of their use of variety-seeking behavior as a coping strategy. It is the shopping pattern and store environments that drive the behavioral responses in wine purchases to freedom-limited circumstances. Coupon use is associated with a lower variety-seeking tendency at the beginning of the stay-at-home order, but the variety level resumes when more time has passed in the restriction periods. Variety-seeking tendency increases with shopping frequency at the beginning of the social distancing measure but decreases to a level lower than all the non-restriction periods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
14. Marburg Virus Outbreak and a New Conspiracy Theory: Findings from a Comprehensive Analysis and Forecasting of Web Behavior.
- Author
-
Thakur, Nirmalya, Cui, Shuqi, Patel, Kesha A., Azizi, Nazif, Knieling, Victoria, Han, Changhee, Poon, Audrey, and Shah, Rishika
- Subjects
MARBURG virus, CONSPIRACY theories, EMERGENCY communication systems, VIRUS diseases, FORECASTING, EVIDENCE gaps
- Abstract
During virus outbreaks in the recent past, web behavior mining, modeling, and analysis have served as means to examine, explore, interpret, assess, and forecast the worldwide perception, readiness, reactions, and response linked to these virus outbreaks. The recent outbreak of the Marburg Virus disease (MVD), the high fatality rate of MVD, and the conspiracy theory linking the FEMA alert signal in the United States on 4 October 2023 with MVD and a zombie outbreak, resulted in a diverse range of reactions in the general public which has transpired in a surge in web behavior in this context. This resulted in "Marburg Virus" featuring in the list of the top trending topics on Twitter on 3 October 2023, and "Emergency Alert System" and "Zombie" featuring in the list of top trending topics on Twitter on 4 October 2023. No prior work in this field has mined and analyzed the emerging trends in web behavior in this context. The work presented in this paper aims to address this research gap and makes multiple scientific contributions to this field. First, it presents the results of performing time-series forecasting of the search interests related to MVD emerging from 216 different regions on a global scale using ARIMA, LSTM, and Autocorrelation. The results of this analysis present the optimal model for forecasting web behavior related to MVD in each of these regions. Second, the correlation between search interests related to MVD and search interests related to zombies was investigated. The findings show that there were several regions where there was a statistically significant correlation between MVD-related searches and zombie-related searches on Google on 4 October 2023. Finally, the correlation between zombie-related searches in the United States and other regions was investigated. This analysis helped to identify those regions where this correlation was statistically significant. [ABSTRACT FROM AUTHOR]
- Published
- 2023
15. Predicting Prices of Case Furniture Products Using Web Mining Techniques.
- Author
-
Bardak, Timucin
- Subjects
- *CASE goods, *DEEP learning, *FURNITURE sales & prices, *RANDOM forest algorithms, *CHESTS (Furniture), *ESTIMATION theory, *COAL sales & prices
- Abstract
This article presents a methodology based on web mining techniques for estimating furniture prices using e-commerce data. Data on different public e-commerce sites in the United States were collected and analyzed using web mining methods. Deep learning and random forest algorithms were used to predict the prices of different types of furniture. Bookcase and dresser type furniture, which are widely used in price estimation, were selected. The inquiry identified a collection of eight distinctive attributes linked to furniture items, spanning measurements such as width, depth, and height, alongside features encompassing frame material, partition count, drawer count, color, and price. In preparation for constructing predictive models, a dataset comprising 300 instances was compiled for comprehensive analysis. Models developed based on web mining to predict furniture prices gave satisfactory results. During the testing phase, the random forest algorithm outperformed deep learning, achieving high goodness of fit values of 0.89 and 0.94 for bookcase and dresser furniture, respectively. The results indicate that price estimation for dresser furniture was more accurate than for bookcases in all models. The findings demonstrate that web mining techniques can be used effectively in competitive furniture pricing, with potential to save time and cost in pricing for furniture purchasing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
16. Annex B. Additional results.
- Author
-
Schmidt, Julia, Pilgrim, Graham, and Mourougane, Annabelle
- Subjects
DATA mining, OCCUPATIONS, JOB postings, INDUSTRIES
- Published
- 2023
17. Cardiovascular and Cerebrovascular Safety of Ranibizumab, Bevacizumab, and Aflibercept in Ocular Diseases: An Analysis of the US Food and Drug Administration Adverse Event Reporting System (FAERS) Database.
- Author
-
Zeng, Yanbin, Guo, Xiaohui, Xiao, Fengjiao, and Zhang, Haixia
- Subjects
- *HYPERTENSION risk factors, *VASCULAR endothelial growth factor antagonists, *HEART failure risk factors, *THROMBOSIS risk factors, *THERAPEUTIC use of monoclonal antibodies, *CEREBROVASCULAR disease risk factors, *EMBOLISM risk factors, *CARDIOVASCULAR diseases risk factors, *CENTRAL nervous system diseases, *CONFIDENCE intervals, *CROSS-sectional method, *MYOCARDIAL ischemia, *PULMONARY hypertension, *CARDIOMYOPATHIES, *MONOCLONAL antibodies, *LONG QT syndrome, *RISK assessment, *VENTRICULAR tachycardia, *RESEARCH funding, *DESCRIPTIVE statistics, *BEVACIZUMAB, *ARRHYTHMIA, *ODDS ratio, *DRUG side effects, *RECOMBINANT proteins, *EYE diseases, *PATIENT safety, *DATA mining, *DISEASE risk factors
- Abstract
The cardiovascular and cerebrovascular safety of ranibizumab, bevacizumab, and aflibercept for ocular diseases is unclear. This study aimed to evaluate and compare the cardiovascular and cerebrovascular safety in patients receiving ranibizumab, bevacizumab, and aflibercept for ocular disease. A cross-sectional study was conducted from 2017 (Q1) to 2021 (Q4) in the US Food and Drug Administration Adverse Event Reporting System (FAERS) database. The outcomes of interest were central nervous system vascular disorders, ischemic heart disease, hypertension, pulmonary hypertension, torsade de pointes/QT prolongation, embolic and thrombotic events, cardiac arrhythmias, cardiac failure, and cardiomyopathy. Data mining was performed by a disproportional method with a compression, using compressed reporting odds ratios (sRORs) with 95% confidence intervals (CIs) to measure signals. The results showed 1462 cardiovascular and cerebrovascular events associated with aflibercept, 834 with ranibizumab, and 150 with bevacizumab. Ranibizumab, bevacizumab, and aflibercept were linked to central nervous system vascular disorders (sROR, 5.57 [95%CI, 4.95–6.26] vs sROR, 2.23 [95%CI, 1.75–2.85] vs sROR, 2.73 [95%CI, 2.43–3.06]), ischemic heart disease (sROR, 3.31 [95%CI, 2.65–4.13] vs sROR, 1.98 [95%CI, 1.24–3.16] vs sROR, 3.00 [95%CI, 2.46–3.65]), and embolic and thrombotic events (sROR, 3.36 [95%CI, 3.04–3.72] vs sROR, 2.16 [95%CI, 1.70–2.74] vs sROR, 5.25 [95%CI, 4.82–5.72]). Both ranibizumab and bevacizumab produced hypertension (sROR, 1.73 [95%CI, 1.41–2.12] vs sROR, 1.46 [95%CI, 1.03–2.06]) and arrhythmias (sROR, 2.82 [95%CI, 1.99–3.99] vs sROR, 2.13 [95%CI, 1.08–4.22]) signals. The signals of heart failure were detected in ranibizumab (sROR, 5.64 [95%CI, 4.08–7.79]) and aflibercept (sROR, 2.80 [95%CI, 2.03–3.86]). Ranibizumab, bevacizumab, and aflibercept for ocular disease have different cardiovascular and cerebrovascular safety profiles. The overall cardiovascular and cerebrovascular risk of the patient should be thoroughly assessed in order to select the safest drug for treatment. [ABSTRACT FROM AUTHOR]
- Published
- 2023
18. Federal funding allocation on HIV/AIDS research in the United States (2008–2018): an exploratory study using Big Data.
- Author
-
Lyu, Tianchu, Qiao, Shan, Hair, Nicole, Liang, Chen, and Li, Xiaoming
- Subjects
- *HIV infections, *RESEARCH, *CENSUS, *POPULATION geography, *ENDOWMENT of research, *GOVERNMENT aid, *ECONOMIC aspects of diseases, *DATA analytics, *FINANCIAL management, *STATISTICAL models, *HEALTH equity, *AIDS, *DATA mining
- Abstract
Literature suggests that federal funding allocation for HIV-related research in the US may not align with HIV disease burden but is influenced by structural disparities. This study sought to examine how federal funding allocation is associated with HIV disease burden and research capacity of states by applying Big Data integration, text mining, and statistics. Using text mining, we identified 20,678 HIV-related federal projects from 2008 to 2018 in NIH ExPORTER, which were then integrated with data from AtlasPlus and US Census Bureau. We developed Gini coefficients to assess the inequality of funding and the Generalized Estimating Equations model to examine the associations between funding allocation and (1) state HIV disease burden, (2) state research capacity, and (3) geographic regions, respectively. The Gini coefficients (0.60 to 0.80) suggest a highly skewed funding distribution. Funding allocation was not associated with state HIV disease burden (p = 0.269) but HIV research capacity (p = 0.000). The South (with the heaviest HIV disease burden) did not receive significantly more federal funding. Our findings for the first time identified disparities of federal funding allocation, suggesting that federal agencies favor states of high research capacity over heavy disease burden, which may reinforce the HIV-related health disparities. [ABSTRACT FROM AUTHOR]
- Published
- 2023
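Note on record 18: the abstract reports Gini coefficients of 0.60 to 0.80 for funding inequality but does not say how they were computed. The sketch below shows one common rank-weighted formula for the Gini coefficient. The function name and the sample figures are hypothetical and are not taken from the study.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means a perfectly equal distribution; values close to 1 mean that
    a few units hold almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with ranks i = 1..n over the values sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical funding totals for four states (illustrative only).
print(gini([1, 2, 3, 4]))
print(gini([0, 0, 0, 10]))
```

With this formula, the maximum for n units is (n - 1) / n, reached when a single unit receives all the funding.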
19. Hyperparameter optimization for cardiovascular disease data-driven prognostic system.
- Author
-
Saputra, Jayson, Lawrencya, Cindy, Saini, Jecky Mitra, and Suharjito, Suharjito
- Subjects
DATA mining software ,DATA mining ,CARDIOVASCULAR diseases ,CLASSIFICATION ,SUPPORT vector machines ,K-nearest neighbor classification ,K-means clustering ,GREY relational analysis - Abstract
Prediction and diagnosis of cardiovascular diseases (CVDs) based, among other things, on medical examinations and patient symptoms are the biggest challenges in medicine. About 17.9 million people die from CVDs annually, accounting for 31% of all deaths worldwide. With a timely prognosis and thorough consideration of the patient's medical history and lifestyle, it is possible to predict CVDs and take preventive measures to eliminate or control this life-threatening disease. In this study, we used various patient datasets from a major hospital in the United States as prognostic factors for CVD. The data was obtained by monitoring a total of 918 adult patients aged 28-77 years. In this study, we present a data mining modeling approach to analyze the performance, classification accuracy and number of clusters on Cardiovascular Disease Prognostic datasets in unsupervised machine learning (ML) using the Orange data mining software. Various techniques are then used to classify the model parameters, such as k-nearest neighbors, support vector machine, random forest, artificial neural network (ANN), naïve Bayes, logistic regression, stochastic gradient descent (SGD), and AdaBoost. To determine the number of clusters, various unsupervised ML clustering methods were used, such as k-means, hierarchical, and density-based spatial clustering of applications with noise clustering. The results showed that the best model performance analysis and classification accuracy were SGD and ANN, both of which had a high score of 0.900 on Cardiovascular Disease Prognostic datasets. Based on the results of most clustering methods, such as k-means and hierarchical clustering, Cardiovascular Disease Prognostic datasets can be divided into two clusters. The prognostic accuracy of CVD depends on the accuracy of the proposed model in determining the diagnostic model. The more accurate the model, the better it can predict which patients are at risk for CVD. [ABSTRACT FROM AUTHOR]
- Published
- 2023
20. Application of text mining to the development and validation of a geographic search filter to facilitate evidence retrieval in Ovid MEDLINE: An example from the United States.
- Author
-
Cheung, Antoinette, Popoff, Evan, and Szabo, Shelagh M.
- Subjects
- *
ABSTRACTING , *SUBJECT headings , *DATABASE searching , *BIBLIOGRAPHY , *CITATION analysis , *COMPARATIVE studies , *BIBLIOGRAPHICAL citations , *INFORMATION science , *INFORMATION retrieval , *DESCRIPTIVE statistics , *MEDLINE , *STATISTICAL sampling , *SENSITIVITY & specificity (Statistics) , *DATA mining , *MEDICAL research , *ALGORITHMS - Abstract
Background: Given the increasing volume of published research in bibliographic databases, efficient retrieval of evidence is crucial and represents an opportunity to integrate novel techniques such as text mining. Objectives: To develop and validate a geographic search filter for identifying research from the United States (US) in Ovid MEDLINE. Methods: US and non‐US citations were collected from bibliographies of evidence‐based reviews. Citations were partitioned by US/non‐US status and randomly divided to a training and testing set. Using text mining, common one‐ and two‐word terms in title/abstract fields were identified, and frequencies compared between US/non‐US citations. Results: Common US‐related terms included (as ratio of frequency in US/non‐US citations) US populations and geographic terms [e.g., 'Americans' (15.5), 'Baltimore' (20.0)]. Common non‐US terms were non‐US geographic terms [e.g., 'Japan' (0.04), 'French' (0.05)]. A search filter was developed with 98.3% sensitivity and 82.7% specificity. Discussion: This search filter will streamline the identification of evidence from the US. Periodic updates may be necessary to reflect changes in MEDLINE's controlled vocabulary. Conclusion: Text mining was instrumental to the development of this search filter. A novel technique generated a gold standard set comprising >20,000 citations. This method may be adapted to develop subsequent geographic search filters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
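Note on record 20: the abstract describes comparing the frequency of one- and two-word terms between US and non-US citations and then reporting the filter's sensitivity and specificity. A minimal sketch of that kind of comparison follows. The tokenizer, the smoothing constant, all function names, and the sample titles are assumptions made for illustration and do not reproduce the authors' procedure.

```python
import re
from collections import Counter


def terms(text):
    """Return the set of one- and two-word terms in a title or abstract."""
    words = re.findall(r"[a-z]+", text.lower())
    return set(words) | {" ".join(pair) for pair in zip(words, words[1:])}


def doc_frequency(docs):
    """Fraction of documents that contain each term."""
    counts = Counter(term for doc in docs for term in terms(doc))
    return {term: count / len(docs) for term, count in counts.items()}


def frequency_ratios(us_docs, non_us_docs, smoothing=0.01):
    """US / non-US frequency ratio per term (smoothing avoids division by zero)."""
    us, non_us = doc_frequency(us_docs), doc_frequency(non_us_docs)
    return {
        term: (us.get(term, 0.0) + smoothing) / (non_us.get(term, 0.0) + smoothing)
        for term in set(us) | set(non_us)
    }


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of boolean predictions (both classes must occur)."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fn = sum((not p) and a for p, a in pairs)
    tn = sum((not p) and (not a) for p, a in pairs)
    fp = sum(p and (not a) for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical citation titles (illustrative only).
us_titles = ["Veterans in Baltimore", "Adults in Baltimore clinics"]
non_us_titles = ["Adults in Japan", "Patients in Japan clinics"]
ratios = frequency_ratios(us_titles, non_us_titles)
print(ratios["baltimore"], ratios["japan"])
```

Ratios well above 1 mark terms typical of US citations and ratios well below 1 mark terms typical of non-US citations, which matches the direction of the examples quoted in the abstract.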
21. Data-Driven Analysis of Employee Churn in the Home Care Industry.
- Author
-
Vergnolle, Guillaume and Lahrichi, Nadia
- Subjects
- *
WORK environment , *SHIFT systems , *DECISION trees , *HOME care services , *AGE distribution , *MACHINE learning , *RANDOM forest algorithms , *LABOR turnover , *CONTRACTS , *EMPLOYEES' workload , *JOB satisfaction , *DESCRIPTIVE statistics , *RESEARCH funding , *INTENTION , *LOGISTIC regression analysis , *ARTIFICIAL neural networks , *DATA mining , *EMPLOYEE retention - Abstract
Annual turnover of home care workers represents a huge loss of revenue and is a key source of inefficiency in the home health care industry. In this article, we propose a data-driven approach to monitor employee churn and to capture the evolution of employee intent to leave. Unlike most papers in the literature, we use machine learning techniques to analyze over 2 million visits in the US, Canada, and Australia between 2016 and 2019. Results show that the gap between the number of hours worked and in the contract is the most important factor to predict employee intent to leave, which means an employee should be given as many hours as requested in the contract to improve retention. Secondary results show that having diverse shift lengths and continuity in services and patients seem to be associated with less turnover. [ABSTRACT FROM AUTHOR]
- Published
- 2023
22. A database of synthetic inelastic neutron scattering spectra from molecules and crystals.
- Author
-
Cheng, Yongqiang, Stone, Matthew B., and Ramirez-Cuesta, Anibal J.
- Subjects
INELASTIC neutron scattering ,DATABASES ,MOLECULAR spectra ,CRYSTALS ,DATA mining - Abstract
Inelastic neutron scattering (INS) is a powerful tool to study the vibrational dynamics in a material. The analysis and interpretation of the INS spectra, however, are often nontrivial. Unlike diffraction, for which one can quickly calculate the scattering pattern from the structure, the calculation of INS spectra from the structure involves multiple steps requiring significant experience and computational resources. To overcome this barrier, a database of INS spectra consisting of commonly seen materials will be a valuable reference, and it will also lay the foundation of advanced data-driven analysis and interpretation of INS spectra. Here we report such a database compiled for over 20,000 organic molecules and over 10,000 inorganic crystals. The INS spectra are obtained from a streamlined workflow, and the synthetic INS spectra are also verified by available experimental data. The database is expected to greatly facilitate INS data analysis, and it can also enable the utilization of advanced analytics such as data mining and machine learning. Notice: This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). [ABSTRACT FROM AUTHOR]
- Published
- 2023
23. Standing on the shoulders of giants: Online formative assessments as the foundation for predictive learning analytics models.
- Author
-
Bulut, Okan, Gorgun, Guher, Yildirim‐Erbasli, Seyma N., Wongvorachan, Tarid, Daniels, Lia M., Gao, Yizhu, Lai, Ka Wing, and Shin, Jinnie
- Subjects
- *
LEARNING Management System , *EDUCATIONAL technology , *FORMATIVE tests , *DATA mining , *ACADEMIC achievement , *YOUNG adults , *HIGHER education - Abstract
As universities around the world have begun to use learning management systems (LMSs), more learning data have become available to gain deeper insights into students' learning processes and make data‐driven decisions to improve student learning. With the availability of rich data extracted from the LMS, researchers have turned much of their attention to learning analytics (LA) applications using educational data mining techniques. Numerous LA models have been proposed to predict student achievement in university courses. To design predictive LA models, researchers often follow a data‐driven approach that prioritizes prediction accuracy while sacrificing theoretical links to learning theory and its pedagogical implications. In this study, we argue that instead of complex variables (e.g., event logs, clickstream data, timestamps of learning activities), data extracted from online formative assessments should be the starting point for building predictive LA models. Using the LMS data from multiple offerings of an asynchronous undergraduate course, we analysed the utility of online formative assessments in predicting students' final course performance. Our findings showed that the features extracted from online formative assessments (e.g., completion, timestamps and scores) served as strong and significant predictors of students' final course performance. Scores from online formative assessments were consistently the strongest predictor of student performance across the three sections of the course. The number of clicks in the LMS and the time difference between first access and due dates of formative assessments were also significant predictors. Overall, our findings emphasize the need for online formative assessments to build predictive LA models informed by theory and learning design. 
Practitioner notes
What is already known about this topic: Higher education institutions often use learning analytics for the early identification of low‐performing students or students at risk of dropping out. Most predictive models in learning analytics rely on immutable student characteristics (e.g., gender, race and socioeconomic status) and complex variables extracted from log data within a learning management system. Prioritizing prediction accuracy without theory orientation often yields "black‐box" models that fail to inform educators on what remedies need to be taken to improve student learning.
What this paper adds: Predictive models in learning analytics should consider learning theory, pedagogy and learning design to identify key predictors of student learning. Online formative assessments can be a starting point for building predictive models that are not only accurate but also provide educators with actionable insights on how student learning can be improved. Time‐related and score‐related features extracted from online formative assessments are particularly useful for predicting students' course performance.
Implications for practice and/or policy: This study provides strong evidence for using online formative assessments as the foundation for predictive models in learning analytics. Student data from online formative assessments can help educators provide students with feedback while informing future formative assessment cycles. Higher education institutions should avoid the hype around complex data from learning management systems and instead rely on effective learning tools such as online formative assessments to revolutionize the use of learning analytics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
24. State Vocational Rehabilitation Service Patterns and Employment Outcomes Predictors Among Native American VR Clients.
- Author
-
Salimi, Nahal, Gere, Bryan, Dallas, Bryan, and Shahab, Amin
- Subjects
- *
NATIVE Americans , *EMPLOYMENT of people with disabilities , *ARTIFICIAL intelligence , *EMPLOYMENT , *DESCRIPTIVE statistics , *VOCATIONAL rehabilitation , *DATA analysis software , *EDUCATIONAL attainment , *DATA mining - Abstract
Purpose. To investigate the demographic pattern and predictors for successful employment outcomes among Native American (NA) clients in public vocational rehabilitation programs between the program years (PY) 2017-2019 in the United States. Method. The researchers completed descriptive analyses to provide a breakdown of the demographic variables, primary and secondary disability, source of referral, and medical coverage. We also examined the factors related to barriers to employment for Native Americans as well as the most frequent VR employment services used and the effectiveness of each for successful employment. Lastly, we used the Exhaustive CHAID data mining method to determine which of the VR services could be the strongest predictors to successful employment for Native Americans. Results. Results showed that in total 77.7% of individuals who received rehabilitation technology were employed after leaving the program. [ABSTRACT FROM AUTHOR]
- Published
- 2023
25. Security Access Control Method for Wind-Power-Monitoring System Based on Agile Authentication Mechanism.
- Author
-
Shu, Yingli, Yuan, Quande, Ke, Wende, and Kou, Lei
- Subjects
ACCESS control ,ONLINE monitoring systems ,WIND turbines ,DATA encryption ,WIND power ,ELECTRIC power distribution grids ,DATA mining - Abstract
With the continuous increase in the proportion of wind power construction and grid connection, the deployment scale of state sensors in wind-power-monitoring systems has grown rapidly. To address the problems that the communication authentication process between the wind turbine status sensor and the monitoring gateway is complex and that the adaptability of the massive sensors is insufficient, a security access control method for a wind-power-monitoring system based on an agile authentication mechanism is proposed in this paper. First, a lightweight key generation algorithm based on a one-way hash function is designed. The algorithm realizes fixed-length compression and encryption of measurement data of any length. Under the condition of ensuring security, the calculation and communication cost in the later stage of authentication are effectively reduced. Second, to reduce the redundant process of wind turbine status sensor authentication, an agile authentication model of wind turbine status sensor based on a lightweight key is constructed. Constrained by the reverse order extraction of key information in the lightweight keychain, the model can realize lightweight communication between massive wind turbine status sensors and regional gateways. Finally, the proposed method is compared and verified using the wind turbine detection data set provided by the National New Energy Laboratory of the United States. The experimental results show that this method can effectively reduce the certification cost of a wind-power-monitoring system. Additionally, it can improve the efficiency of status sensor identity authentication and realize the agility and efficiency of the authentication process. [ABSTRACT FROM AUTHOR]
- Published
- 2022
26. The Data Science Opportunity: Crafting a Holistic Strategy.
- Author
-
Maxwell, Dan, Norton, Hannah, and Wu, Joe
- Subjects
- *
DATA libraries , *DATA science , *LIBRARIES & state , *PUBLIC libraries , *DATA mining - Abstract
The rise of data-driven research and discovery may be one of the greatest strategic opportunities to confront academic libraries in a generation. The argument advanced in this article is that the data science opportunity is about data curation AND data analysis. Thus the development of a holistic data science strategy ought to include both elements. Up until now, academic libraries have largely responded to the data science opportunity from a curatorial and archiving perspective. However, this is beginning to change. The case for crafting a holistic data science strategy is presented in six parts in this article. In part one, a broad overview of the data science opportunity is presented, followed by a definition of data analysis and data curation in part two. The traditional academic library response (curation) and a reframing of it to include data analysis are then presented in two separate parts. And finally, part five reports findings from a recent survey conducted at the University of Florida (UF) which indicates robust demand for training in analytical tools and technologies. The article concludes with some thoughts on the challenges of offering data analysis services, using the UF experience to highlight key issues. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
27. SMALL BUSINESS OWNERS' AGREEABLENESS AND INFLUENCE ON THE FINANCIAL RISK FACED BY BANKS THROUGH THE APPLICATION OF DISTRIBUTED GRAPH-BASED DATA MINING IN THE UNITED STATES.
- Author
-
Kasztelnik, Karina and Moncayo, Luis
- Subjects
FINANCIAL risk ,DATA mining ,AGREEABLENESS ,SMALL business ,MICROFINANCE ,FINANCIAL institutions - Abstract
The primary objective of this experimental research study is to investigate unique small business owners' personality traits, and the influence of their agreeableness on the financial risk faced by banks through the application of distributed graph-based data mining in the United States. We use the parallel coordinators based distributed graphical model to find out the hidden patterns in the input data. Prior studies only found negative significantly correlated agreeableness with the microloans having lower risk measurement. Our studies found both positive significantly correlated agreeableness with the microloans having high risk measurement for the group participants falling within the age range of 36-55 years, and negative significantly correlated agreeableness with the microloans having lower risk measurement for the group participants falling within the age range of 36-55 years. The additional novel findings of our study are that, while we understand that most of the participants to whom we sent the survey did not have the microloans yet, they can function as good stable candidates for the secured microloans per our experimentally unique graphical trends analysis. We discovered that the people who have microloans tend to largely have the following characteristics: usually warm, friendly, and tactful, between 36 and 55 years of age, female, and white. Thus, banks should search for microloan candidates with similar characteristics to be sure that they improve the quality of bank risk and that loans do not fail in their financial asset portfolios. The distributed graph analytics shows more accuracy with prescriptive analytics when compared to the traditional statistical approach. These findings can contribute to improving bank risk to build stronger financial assets with lower bank risk for our financial institutions around the world and the modern new data analytics. [ABSTRACT FROM AUTHOR]
- Published
- 2022
28. Safety Profile of Selective Serotonin Reuptake Inhibitors in Real-World Settings: A Pharmacovigilance Study Based on FDA Adverse Event Reporting System.
- Author
-
Zhao Y, Zhang Y, Yang L, Zhang K, and Li S
- Subjects
- Humans, United States, Retrospective Studies, Male, Adult, Female, Middle Aged, Aged, Young Adult, Adolescent, Databases, Factual, Child, Selective Serotonin Reuptake Inhibitors adverse effects, Pharmacovigilance, Adverse Drug Reaction Reporting Systems statistics & numerical data, United States Food and Drug Administration
- Abstract
Background: Selective serotonin reuptake inhibitors (SSRIs) are the most frequently prescribed agents to treat depression. Considering the growth in antidepressant prescription rates, SSRI-induced adverse events (AEs) need to be comprehensively clarified. Objective: This study was to investigate safety profiles and potential AEs associated with SSRIs using the Food and Drug Administration Adverse Event Reporting System (FAERS). Methods: A retrospective pharmacovigilance analysis was conducted using the FAERS database, with Open Vigil 2.1 used for data extraction. The study included cases from the marketing date of each SSRI (ie, citalopram, escitalopram, fluoxetine, paroxetine, fluvoxamine, and sertraline) to April 30, 2023. We employed the reporting odds ratio and Bayesian confidence propagation neural network as analytical tools to assess the association between SSRIs and AEs. The Medical Dictionary for Regulatory Activities was used to standardize the definition of AEs. AE classification was achieved using system organ classes (SOCs). Results: Overall, 427 655 AE reports were identified for the 6 SSRIs, primarily associated with 25 SOCs, including psychiatric, nervous system, congenital, familial, genetic, cardiac, and reproductive disorders. Notably, sertraline (n = 967) and fluvoxamine (n = 169) exhibited the highest and lowest signal frequencies, respectively. All SSRIs had relatively strong signals related to congenital, psychiatric, and nervous disorders. Conclusions and Relevance: Most of our findings are consistent with those reported previously, but some AEs were not previously identified. However, AEs attributed to SSRIs remain ambiguous, warranting further validation. Applying data-mining methods to the FAERS database can provide additional insights that can assist in appropriately utilizing SSRIs. Competing Interests: Declaration of Conflicting Interests. The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
- Published
- 2024
29. Post-marketing safety concerns with rimegepant based on a pharmacovigilance study.
- Author
-
Hu JL, Wu JY, Xu S, Qian SY, Jiang C, and Zheng GQ
- Subjects
- Humans, Male, Female, Adult, Middle Aged, Young Adult, Adolescent, Aged, Retrospective Studies, Child, Product Surveillance, Postmarketing statistics & numerical data, United States epidemiology, Child, Preschool, Piperidines adverse effects, Infant, United States Food and Drug Administration, Aged, 80 and over, Drug-Related Side Effects and Adverse Reactions epidemiology, Pharmacovigilance, Adverse Drug Reaction Reporting Systems statistics & numerical data
- Abstract
Purpose: This study aimed to comprehensively assess the safety of rimegepant administration in real-world clinical settings. Methods: Data from the Food and Drug Administration Adverse Event Reporting System (FAERS) spanning the second quarter of 2020 through the first quarter of 2023 were retrospectively analyzed in this pharmacovigilance investigation. This study focuses on employing subgroup analysis to monitor rimegepant drug safety. Descriptive analysis was employed to examine clinical characteristics and concomitant medication of adverse event reports associated with rimegepant, including report season, reporter country, sex, age, weight, dose, frequency, and onset time. Correlation analysis, including techniques such as violin plots, was utilized to explore relationships between clinical characteristics in greater detail. Additionally, four disproportionality analysis methods were applied to assess adverse event signals associated with rimegepant. Results: Of a total of 5,416,969 adverse event reports extracted from the FAERS database, 10,194 adverse events were identified with rimegepant as the "primary suspect" (PS) drug. Rimegepant-associated adverse events involved 27 System Organ Classes (SOCs), and the significant SOC meeting all four detection criteria was "general disorders and administration site conditions" (SOC: 10018065). Additionally, new significant adverse events were discovered, including "vomiting projectile" (PT: 10047708), "eructation" (PT: 10015137), "motion sickness" (PT: 10027990), "feeling drunk" (PT: 10016330), "reaction to food additive" (PT: 10037977), etc. Descriptive analysis indicated that the majority of reporters were consumers (88.1%), with most reports involving female patients. Significant differences were observed between female and male patients across age categories, and the concomitant use of rimegepant with other medications was complex. Conclusion: This study has preliminarily identified potential new adverse events associated with rimegepant, such as those involving the gastrointestinal system, nervous system, and immune system, which warrant further research to determine their exact mechanisms and risk factors. Additionally, significant differences in rimegepant-related adverse events were observed across different age groups and sexes, and the complexity of concomitant medication use should be given special attention in clinical practice. (© 2024. The Author(s).)
- Published
- 2024
30. Public Response to Federal Electronic Cigarette Regulations Analyzed Using Social Media Data Through Natural Language Processing: Topic Modeling Study.
- Author
-
Lin SY, Tulabandu SK, Koch JR, Hayes R, Barnes A, Purohit H, Chen S, Han B, and Xue H
- Subjects
- United States, Humans, Public Opinion, Government Regulation, Public Health legislation & jurisprudence, Social Media statistics & numerical data, Natural Language Processing, United States Food and Drug Administration, Electronic Nicotine Delivery Systems
- Abstract
Background: e-Cigarette (electronic cigarette) use has been a public health issue in the United States. On June 23, 2022, the US Food and Drug Administration (FDA) issued marketing denial orders (MDOs) to Juul Labs Inc for all their products currently marketed in the United States. However, one day later, on June 24, 2022, a federal appeals court granted a temporary reprieve to Juul Labs that allowed it to keep its e-cigarettes on the market. As the conversation around Juul continues to evolve, it is crucial to gain insights into the sentiments and opinions expressed by individuals on social media. Objective: This study aims to conduct a comprehensive analysis of tweets before and after the ban on Juul, aiming to shed light on public perceptions and sentiments surrounding this contentious topic and to better understand the life cycle of public health-related policy on social media. Methods: Natural language processing (NLP) techniques were used, including state-of-the-art BERTopic topic modeling and sentiment analysis. A total of 6023 tweets and 22,288 replies or retweets were collected from Twitter (rebranded as X in 2023) between June 2022 and October 2022. The encoded topics were used in time-trend analysis to depict the boom-and-bust cycle. Content analyses of retweets were also performed to better understand public perceptions and sentiments about this contentious topic. Results: The attention surrounding the FDA's ban on Juul lasted no longer than a week on Twitter. Not only the news (ie, tweets with a YouTube link that directs to the news site) related to the announcement itself, but the surrounding discussions (eg, potential consequences of this ban or block and concerns toward kids or youth health) diminished shortly after June 23, 2022, the date when the ban was officially announced. Although a short rebound was observed on July 4, 2022, which was contributed by the suspension on the following day, discussions dried out in 2 days. Out of the top 50 most retweeted tweets, we observed that, except for neutral (23/45, 51%) sentiment that broadcasted the announcement, posters responded more negatively (19/45, 42%) to the FDA's ban. Conclusions: We observed a short life cycle for this news announcement, with a preponderance of negative sentiment toward the FDA's ban on Juul. Policy makers could use tactics such as issuing ongoing updates and reminders about the ban, highlighting its impact on public health, and actively engaging with influential social media users who can help maintain the conversation. (© Shuo-Yu Lin, Sahithi Kiran Tulabandu, J Randy Koch, Rashelle Hayes, Andrew Barnes, Hemant Purohit, Songqing Chen, Bo Han, Hong Xue. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 01.10.2024.)
- Published
- 2024
31. Assessment of adverse events of the novel cardiovascular drug vericiguat: a real-world pharmacovigilance study based on FAERS.
- Author
-
Rao J, Chen X, Liu Y, Wang X, Cheng P, and Wang Z
- Subjects
- Humans, United States, Data Mining, Cardiovascular Agents adverse effects, Cardiovascular Agents administration & dosage, Male, Female, Middle Aged, Adult, Pharmacovigilance, Adverse Drug Reaction Reporting Systems statistics & numerical data, Bayes Theorem, United States Food and Drug Administration
- Abstract
Background: This study aims to analyze the adverse event reports (AERs) to vericiguat using data from the Food and Drug Administration Adverse Event Reporting System (FAERS) and provide evidence for the clinical use. Methods: AERs due to vericiguat from 2021Q1 to 2024Q1 identified as the primary suspect were screened, with duplicate reports subsequently eliminated. Various quantitative signal detection methods, including reporting odds ratio (ROR), proportional reporting ratio (PRR), Bayesian confidence propagation neural network, and multi-item gamma Poisson shrinker, were then employed for data mining and analysis. Signal strength is represented by the 95% confidence interval, information component (IC), and empirical Bayesian geometric mean (EBGM). Results: A total of 617 vericiguat-related AERs were identified. Strong signals were observed in 21 system organ classes. Furthermore, the most frequently reported preferred term (PT) was hypotension (n = 86, ROR 25.92, PRR 24.11, IC 4.59, EBGM 24.07), followed by dizziness (n = 52, ROR 6.44, PRR 6.20, IC 2.63, EBGM 6.20), malaise (n = 25, ROR 3.59, PRR 3.54, IC 1.82, EBGM 3.54), blood pressure decreased (n = 23, ROR 20.00, PRR 19.64, IC 4.29, EBGM 19.61), and anemia (n = 21, ROR 6.67, PRR 6.57, IC 2.72, EBGM 6.57). Conclusions: This study extended the adverse reactions documented in the FDA instruction and provided supplementary evidence regarding the clinical safety of vericiguat.
- Published
- 2024
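Note on records 28 to 33: several of these pharmacovigilance abstracts report the reporting odds ratio (ROR) and the proportional reporting ratio (PRR). Both are conventionally computed from a 2x2 table of report counts. The sketch below uses the standard textbook formulas with a 95% confidence interval for the ROR on the log scale. The function name and the counts are hypothetical and are not taken from any of the studies listed.

```python
import math


def disproportionality(a, b, c, d):
    """ROR and PRR from a 2x2 table of spontaneous report counts.

    a: reports with the drug and the event of interest
    b: reports with the drug and any other event
    c: reports with other drugs and the event of interest
    d: reports with other drugs and any other event
    All four counts must be positive.
    """
    ror = (a * d) / (b * c)
    prr = (a / (a + b)) / (c / (c + d))
    # The standard error of ln(ROR) gives a 95% confidence interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return {"ror": ror, "ror_ci": (lower, upper), "prr": prr}


# Hypothetical counts (illustrative only).
result = disproportionality(a=20, b=80, c=100, d=9800)
print(result["ror"], result["prr"], result["ror_ci"])
```

A frequently used screening rule treats a lower confidence bound above 1 as a signal. The exact thresholds applied in each study are not stated in these abstracts.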
32. Hematological adverse events associated with anti-MRSA agents: a real-world analysis based on FAERS.
- Author
-
Yu X, Zhou X, Li M, and Zhao Y
- Subjects
- Humans, Middle Aged, Male, Female, Aged, Adult, United States, Daptomycin adverse effects, Daptomycin administration & dosage, Dose-Response Relationship, Drug, Staphylococcal Infections drug therapy, United States Food and Drug Administration, Bayes Theorem, Young Adult, Time Factors, Anti-Bacterial Agents adverse effects, Anti-Bacterial Agents administration & dosage, Adverse Drug Reaction Reporting Systems statistics & numerical data, Linezolid adverse effects, Linezolid administration & dosage, Tigecycline adverse effects, Tigecycline administration & dosage, Methicillin-Resistant Staphylococcus aureus isolation & purification, Methicillin-Resistant Staphylococcus aureus drug effects, Hematologic Diseases chemically induced, Vancomycin adverse effects, Vancomycin administration & dosage
- Abstract
This study investigated the patterns of hematological adverse events related to daptomycin (DAP), tigecycline (TIG), vancomycin (VAN) and linezolid (LIN) in the FDA Adverse Event Reporting System (FAERS). Adverse event associations were analyzed through calculating reporting odds ratio (ROR), proportional reporting ratio (PRR), multiple gamma Poisson shrinkage (MGPS), and Bayesian confidence propagation neural network (BCPNN). A comprehensive descriptive analysis was also conducted considering factors such as age, gender, daily dose, cumulative dose, and time to onset. The leading hematologic adverse events were eosinophilia for daptomycin, coagulation abnormalities and thrombocytopenia for tigecycline, thrombocytopenia, neutropenia, and anemia for linezolid, and thrombocytopenia, eosinophilia, and neutropenia for vancomycin. Most of the affected patients were over 55 years old. Daily doses for the tigecycline and daptomycin groups exceeded the standard daily dose. The times to onset were 14.00 days for daptomycin (interquartile range [IQR], 4.00-21.00), 6.00 days for tigecycline (IQR, 2.00-9.00), 10.00 days for linezolid (IQR, 4.00-16.50), and 10.00 days for vancomycin (IQR, 5.00-20.00). It is essential to intensify early monitoring and identification of these adverse events, especially in the context of off-label dosages and for elderly patients and individuals taking medication for over one week.
- Published
- 2024
- Full Text
- View/download PDF
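The abstract above names the reporting odds ratio (ROR) and proportional reporting ratio (PRR) without giving formulas. The following is a minimal sketch of the usual 2x2 contingency-table definitions, not the authors' code; the function name `disproportionality` and the example counts are hypothetical.

```python
import math

def disproportionality(a, b, c, d):
    """ROR and PRR from a 2x2 table of spontaneous reports.

    a: target drug, target event      b: target drug, other events
    c: other drugs, target event      d: other drugs, other events
    """
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR), used for an approximate 95% confidence interval.
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ror_ci95 = (
        math.exp(math.log(ror) - 1.96 * se_log_ror),
        math.exp(math.log(ror) + 1.96 * se_log_ror),
    )
    prr = (a / (a + b)) / (c / (c + d))
    return {"ror": ror, "ror_ci95": ror_ci95, "prr": prr}

# Hypothetical counts, for illustration only (not taken from FAERS).
result = disproportionality(a=20, b=980, c=100, d=98900)
print(result)
```

A signal is commonly flagged when the lower bound of the ROR interval exceeds 1, although the exact thresholds vary between studies and are not stated in the abstract.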
33. Safety assessment of Tafamidis: a real-world pharmacovigilance study of FDA adverse event reporting system (FAERS) events.
- Author
-
Li Y, Sun S, Wu H, Zhao L, and Peng W
- Subjects
- Humans, United States, Male, Adult, Middle Aged, Female, Aged, Young Adult, Adolescent, Databases, Factual, Child, Retrospective Studies, Child, Preschool, Infant, Aged, 80 and over, Drug-Related Side Effects and Adverse Reactions epidemiology, Benzoxazoles, Pharmacovigilance, Adverse Drug Reaction Reporting Systems statistics & numerical data, United States Food and Drug Administration
- Abstract
Objective: Tafamidis-associated adverse events (AEs) were investigated retrospectively by data mining the US Food and Drug Administration Adverse Event Reporting System (FAERS) to inform clinical safety. Methods: Data were gathered from the FAERS database, spanning the second quarter of 2019 to the fourth quarter of 2023. A total of 8532 reports of Tafamidis-related adverse events were detected after evaluating 8,432,351 data records. Disproportionality analyses were used to quantify the signal and assess the significance of Tafamidis-associated AEs using four algorithms: the reporting odds ratio (ROR), the proportional reporting ratio (PRR), the multi-item gamma Poisson shrinker (MGPS) and the Bayesian confidence propagation neural network (BCPNN). Results: Among the 8532 reports of AEs with Tafamidis as the primary suspected drug, Tafamidis-induced AEs were identified in 27 system organ classes (SOC). A total of 207 Tafamidis-induced AEs were detected that simultaneously met the criteria of all four algorithms. Our analysis also identified new adverse reactions including Hypoacusis, Deafness, and Essential hypertension. The median onset of adverse reactions associated with Tafamidis was 180 days (interquartile range [IQR] 51-419 days). Conclusion: Tafamidis is a drug that has shown favorable safety and tolerability results in clinical trials. However, a number of adverse reactions associated with Tafamidis have been identified through analysis of the FAERS database. In clinical applications, it is recommended to closely monitor patients' hearing while using Tafamidis. In addition, it is hoped that further experimental and clinical studies will be conducted in the future to understand the mechanisms linking Tafamidis to adverse reactions such as primary hypertension, hyperlipidemia, and height reduction. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
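The BCPNN approach mentioned in the abstract above is built around an information component (IC), which compares observed with expected report counts on a log2 scale. The sketch below shows only a point estimate with a shrinkage constant of 0.5, which is a commonly used convention assumed here; the abstract does not state the authors' exact formula or thresholds, and the function name and counts are hypothetical.

```python
import math

def information_component(a, b, c, d, shrink=0.5):
    """Shrunk log2 observed-to-expected ratio for a drug-event pair.

    a: target drug, target event      b: target drug, other events
    c: other drugs, target event      d: other drugs, other events
    """
    n = a + b + c + d
    # Expected count if drug and event were reported independently.
    expected = (a + b) * (a + c) / n
    return math.log2((a + shrink) / (expected + shrink))

# Hypothetical counts, for illustration only.
ic = information_component(a=20, b=980, c=100, d=98900)
print(round(ic, 2))
```

A full BCPNN analysis also reports a credibility interval around the IC, which this sketch omits.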
34. NIH ends funding for key parasitology database.
- Author
-
Wadman M
- Subjects
- Humans, Data Mining, Financing, Government, Research Support as Topic, United States, Databases, Factual economics, Malaria economics, National Institutes of Health (U.S.) economics, Parasitology economics
- Abstract
Trove of data-mining resources on malaria and other killers will need donations to stay alive.
- Published
- 2024
- Full Text
- View/download PDF
35. A quantitative content analysis of topical characteristics of the online COVID-19 infodemic in the United States and Japan.
- Author
-
Seah M and Iwakuma M
- Subjects
- Humans, Japan, United States epidemiology, Communication, Social Media statistics & numerical data, SARS-CoV-2, Data Mining, COVID-19 epidemiology
- Abstract
Background: The COVID-19 pandemic has spurred the growth of a global infodemic. In order to combat the COVID-19 infodemic, it is necessary to understand what kinds of misinformation are spreading. Furthermore, various local factors influence how the infodemic manifests in different countries. Therefore, understanding how and why infodemics differ between countries is a matter of interest for public health. This study aims to elucidate and compare the types of COVID-19 misinformation produced from the infodemic in the US and Japan. Methods: COVID-19 fact-checking articles were obtained from the two largest publishers of fact-checking articles in each language. 1,743 US articles and 148 Japanese articles in their respective languages were gathered, with articles published between 23 January 2020 and 4 November 2022. Articles were analyzed using the free text mining software KH Coder. Exploration of frequently-occurring words and groups of related words was carried out. Based on agglomeration plots and prior research, eight categories of misinformation were created. Lastly, coding rules were created for these eight categories, and a chi-squared test was performed to compare the two datasets. Results: Overall, the most frequent words in both languages were health-related terms, but the Japan dataset had more words referring to foreign countries. Among the eight categories, differences with chi-squared p ≤ 0.01 were found after Holm-Bonferroni p value adjustment for the proportions of misinformation regarding statistics (US 40.0% vs. JP 25.7%, ϕ 0.0792); origin of the virus and resultant discrimination (US 7.0% vs. JP 20.3%, ϕ 0.1311); and COVID-19 disease severity, treatment, or testing (US 32.6% vs. JP 45.9%, ϕ 0.0756). Conclusions: Local contextual factors were found that likely influenced the infodemic in both countries; representations of these factors include societal polarization in the US and the HPV vaccine scare in Japan. It is possible that Japan's relative resistance to misinformation affects the kinds of misinformation consumed, directing attention away from conspiracy theories and towards health-related issues. However, more studies need to be done to verify whether misinformation resistance affects misinformation consumption patterns this way. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
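The comparison described in the abstract above (a chi-squared test per category, a ϕ effect size, and Holm-Bonferroni adjustment) can be sketched as follows. This is an illustrative sketch, not the authors' analysis: the counts are reconstructed from the rounded percentages in the abstract, no continuity correction is applied (the abstract does not say whether one was used), and the other p-values passed to `holm_adjust` are made up.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-squared survival function, 1 degree of freedom
    phi = math.sqrt(chi2 / n)           # effect size for a 2x2 table
    return chi2, p, phi

def holm_adjust(pvalues):
    """Holm-Bonferroni step-down adjustment; returns p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

# "Statistics" category: roughly 40.0% of 1,743 US articles vs 25.7% of 148 JP articles.
# These counts are approximate reconstructions, so results only approximate the abstract.
chi2, p, phi = chi2_2x2(697, 1743 - 697, 38, 148 - 38)
print(round(chi2, 2), round(phi, 4))

# Adjusting across several categories (the other two p-values are hypothetical).
print(holm_adjust([p, 0.04, 0.03]))
```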
36. Assessment of EMR ML Mining Methods for Measuring Association between Metal Mixture and Mortality for Hypertension.
- Author
-
Xu S and Sun M
- Subjects
- Humans, Female, Middle Aged, Male, Risk Assessment, United States epidemiology, Risk Factors, Electronic Health Records, Aged, Predictive Value of Tests, Adult, Prognosis, Metals, Heavy blood, Metals, Heavy urine, Metals, Heavy adverse effects, Data Mining, Nutrition Surveys, Hypertension mortality, Hypertension physiopathology, Hypertension diagnosis, Machine Learning
- Abstract
Introduction: There are limited data available regarding the connection between heavy metal exposure and mortality among hypertension patients. Aim: We intend to establish an interpretable machine learning (ML) model with high efficiency and robustness that monitors mortality based on heavy metal exposure among hypertension patients. Methods: Our datasets were obtained from the US National Health and Nutrition Examination Survey (NHANES, 2013-2018). We developed 5 ML models for mortality prediction among hypertension patients by heavy metal exposure, and tested them by 10 discrimination characteristics. Further, we chose the optimally performing model after parameter adjustment by genetic algorithm (GA) for prediction. Finally, in order to visualize the model's ability to make decisions, we used the SHapley Additive exPlanation (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) algorithms to illustrate the features. The study included 2347 participants in total. Results: A best-performing eXtreme Gradient Boosting (XGB) model with GA for mortality prediction among hypertension patients by 13 heavy metals was selected (AUC 0.959; 95% CI 0.953-0.965; accuracy 96.8%). According to the sum of SHAP values, cadmium (0.094), cobalt (2.048), lead (1.12), tungsten (0.129) in urine, and lead (2.026), mercury (1.703) in blood positively influenced the model, while barium (-0.001), molybdenum (-2.066), antimony (-0.398), tin (-0.498), thallium (-2.297) in urine, and selenium (-0.842), manganese (-1.193) in blood negatively influenced the model. Conclusions: Hypertension patients' mortality associated with heavy metal exposure was predicted by an efficient, robust, and interpretable GA-XGB model with SHAP and LIME. Cadmium, cobalt, lead, tungsten in urine, and mercury in blood are positively correlated with mortality, while barium, molybdenum, antimony, tin, thallium in urine, and lead, selenium, manganese in blood are negatively correlated with mortality. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
37. Empowering Research With the American Heart Association Get With The Guidelines Registries Through Integration of a Database and Research Tools.
- Author
-
Beon C, Wang L, Manchanda V, Mallya P, Hong H, Picotte H, Thomas K, Hall JL, Zhao J, and Feng X
- Subjects
- Humans, United States, User-Computer Interface, Biomedical Research, Guideline Adherence standards, Cardiovascular Diseases therapy, Cardiovascular Diseases diagnosis, Cardiovascular Diseases epidemiology, Data Curation, Quality Indicators, Health Care standards, Data Mining, Access to Information, Health Information Interoperability, Registries, American Heart Association, Databases, Factual, Practice Guidelines as Topic
- Abstract
Background: The American Heart Association's Get With The Guidelines (GWTG) has emerged as a vital resource in advancing the standards and practices of inpatient care across stroke, heart failure, coronary artery disease, atrial fibrillation, and resuscitation focus areas. The GWTG registry data have also created new opportunities for secondary use of real-world clinical data in biomedical research. Our goal was to implement a scalable database with an integrated user interface (UI) to improve GWTG data management and accessibility. Methods: The curation of registry data begins with a data processing and quality control pipeline programmed in Python. This pipeline includes data cleaning and record exclusion, variable derivation and unit harmonization, limited data set preparation, and documentation generation of the registry data. The database was built using PostgreSQL, and integrations between the database and the UI were built using the Django Web Framework in Python. Smaller subsets of data were created using SQLite database files for distribution purposes. Use cases of these tools are provided in the article. Results: We implemented an automated data curation pipeline, centralized database, and UI application for the American Heart Association GWTG registry data. The database and the UI are accessible through a Precision Medicine Platform workspace. As of March 2024, the database contains over 13.2 million cleaned GWTG patient records. The SQLite subsets benefit researchers by optimizing data extraction and manipulation using Structured Query Language. The UI improves accessibility for nontechnical researchers by presenting data in a user-friendly tabular format with intuitive filtering options. Conclusions: With the implementation of the GWTG database and UI application, we addressed data management and accessibility concerns despite the registry's growing scale. We have launched tools to provide streamlined access to GWTG registry data for all researchers, regardless of familiarity or experience in coding. Competing Interests: None.
- Published
- 2024
- Full Text
- View/download PDF
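The abstract above describes distributing smaller data subsets as SQLite files that researchers query with SQL. A minimal sketch of that workflow, using Python's standard `sqlite3` module, might look like the following. The table name, column names, and rows are entirely hypothetical and do not reflect the actual GWTG schema.

```python
import sqlite3

# An in-memory database stands in for a distributed SQLite subset file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    "record_id INTEGER PRIMARY KEY, registry TEXT, admit_year INTEGER, age INTEGER)"
)
conn.executemany(
    "INSERT INTO records (registry, admit_year, age) VALUES (?, ?, ?)",
    [
        ("stroke", 2022, 71),
        ("stroke", 2023, 64),
        ("heart_failure", 2023, 80),
        ("heart_failure", 2023, 58),
    ],
)

# Parameterized query: record count and mean age per registry for one year.
rows = conn.execute(
    "SELECT registry, COUNT(*), AVG(age) FROM records "
    "WHERE admit_year = ? GROUP BY registry ORDER BY registry",
    (2023,),
).fetchall()
print(rows)
conn.close()
```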
38. Developing a Geospatial Framework for Severe Occupational Injuries Using Moran's I and Getis-Ord Gi* Statistics for Southeastern United States.
- Author
-
Fahad, Md. Golam Rabbani, Zech, Wesley C., Nazari, Rouzbeh, and Karimi, Maryam
- Subjects
GEOGRAPHIC information systems, GEOSPATIAL data, WORK-related injuries, INDUSTRIAL safety, WEB-based user interfaces, DATA mining
- Abstract
Occupational safety and health (OSH) related agencies have a plethora of injury data available; however, modern data visualization and user-friendly dissemination tools are lacking in this field. This work has created a geospatial platform using geographic information systems (GIS) combined with big data analytics for the southeastern region of United States. Severe injury reports were collected and visualized through state-of-the-art spatial statistics including spatial clustering, heat maps, and hotspots to identify the most vulnerable regions due to various injury types from a safety vantage point. A special focus was also given to the construction industry, considering the hazardous nature of the industry. Statistically significant spatial clustering was observed within the study region, with Moran's I index of 0.49. Hotspots for severe injuries were also identified with approximately 99% confidence level using Getis Ord Gi* statistics. Results indicated statistically significant high-risk zones particularly around city areas with spatial injury rates (SIR) up to approximately 2.84 per county. Analysis also showed increased number of severe injuries during summer months, with approximately 1,000 injuries during the month of June and July. The construction industry accounted for 20% of all injuries, with "caught-in/between" being the highest amongst the four primary causes of severe injuries, commonly known as the "fatal four." Finally, a web-based application was created to disseminate the results. Big data mining coupled with geospatial technology in OSH management can offer decision makers up-to-date and highly geospatial information for severe injuries that can be integrated in developing a comprehensive occupational safety surveillance plan. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
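The abstract above reports a global Moran's I of 0.49 as evidence of spatial clustering. As a rough illustration of what that statistic measures, the sketch below computes Moran's I from a list of values and a spatial weights matrix. The four-unit "chain" of neighbouring counties and the values are hypothetical; the study's actual weights scheme is not given in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Four units in a row; each unit neighbours the ones next to it (binary weights).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))  # similar neighbours: positive autocorrelation
print(morans_i([1, 0, 1, 0], chain))  # alternating neighbours: negative autocorrelation
```

Values near +1 indicate that similar values cluster together, values near 0 indicate spatial randomness, and negative values indicate dispersion.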
39. The Development of Mean-Variance Efficient Portfolios: 30 Years Later.
- Author
-
Chava, Sudheer and Guerard, Jr., John B.
- Subjects
PORTFOLIO management (Investments), PORTFOLIO managers (Investments), STOCK exchanges, DATA mining
- Abstract
In 1992, in the initial year of this journal's publication, Guerard and Takano reported mean-variance efficient portfolios for the Japanese and US equity markets and showed that the use of a regression-weighted composite model of earnings, book value, cash flow, sales, and their relative variables outperformed their respective equity benchmarks by approximately 400 basis points annually. Two years later, Markowitz and Xu tested the composite model strategy and found that its excess returns were statistically significant from a variety of models tested and that the composite model strategy was not the result of data mining. For the 30th anniversary issue, the authors of this article report robust regression modeling results for the 2001-2020 period using the latest features in R and the latest commercially available multi-factor models for portfolio selection. Quantitative investing requires constant implementation and discipline to maximize client wealth. The authors' results suggest that stock selection models can be effectively employed to deliver excess returns. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
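For readers unfamiliar with the term, a mean-variance efficient portfolio trades expected return against variance. The sketch below is a textbook two-asset illustration only, not the regression-weighted composite model described in the abstract above: it computes fully invested weights proportional to the inverse covariance matrix times the expected returns. All numbers and function names are hypothetical.

```python
def mean_variance_weights(mu, cov):
    """Two-asset weights proportional to inverse(cov) @ mu, scaled to sum to 1."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    raw = [
        (s22 * mu[0] - s12 * mu[1]) / det,
        (s11 * mu[1] - s21 * mu[0]) / det,
    ]
    total = sum(raw)
    return [w / total for w in raw]

def portfolio_stats(weights, mu, cov):
    """Expected return and variance of a two-asset portfolio."""
    ret = sum(w * m for w, m in zip(weights, mu))
    var = sum(weights[i] * cov[i][j] * weights[j] for i in range(2) for j in range(2))
    return ret, var

# Hypothetical expected returns and covariance matrix.
mu = [0.08, 0.12]
cov = [[0.04, 0.01], [0.01, 0.09]]
weights = mean_variance_weights(mu, cov)
print(weights, portfolio_stats(weights, mu, cov))
```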
40. Social experiences with mental health service use among US adolescents.
- Author
-
Xie, Xin, Wang, Nianyang, and Chu, Jun
- Subjects
- *STUDENT health, *SOCIAL participation, *SOCIAL support, *SUBSTANCE abuse, *SOCIAL networks, *MULTIPLE regression analysis, *HISPANIC Americans, *DESCRIPTIVE statistics, *DISEASE prevalence, *MENTAL depression, *HEALTH insurance, *ODDS ratio, *EMOTIONS, *MENTAL health services, *RELIGION, *AFRICAN Americans, *DISEASE complications, *ADOLESCENCE
- Abstract
Little is known about the associations of social experiences with mental health service use. This study aimed to classify social experience variables in the past year and examine the associations of selected social experience variables with mental health service use among US adolescents. A total of 13,038 adolescents (aged 12 to 17), of whom 2208 received mental health services, were drawn from the 2018 National Survey on Drug Use and Health. Multivariate logistic regression (MLR) analysis was conducted. The overall prevalence of mental health service use was 16.1%. Forty-four variables on social experiences were grouped into 10 disjoint clusters, and one variable from each cluster was selected for MLR analysis. Being female, African American, Hispanic, or insured, and having depression in the past year, were associated with increased odds of mental health service use. Negative feelings about going to school, having a serious fight at school/work, active involvement in substance use help programs, knowledge of drug prevention, and negative perceptions about the role of religious beliefs in life decisions were positively associated with mental health service use. Mental health service use is associated with feelings about school and peers, perceptions about drug use, and involvement in activities. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
41. A data mining analysis of COVID-19 cases in states of United States of America.
- Author
-
Yavuz, Özerk
- Subjects
DATA mining, COVID-19 pandemic, STANDARD deviations, DATA analysis, PANDEMICS
- Abstract
Epidemic diseases can be extremely dangerous because of their hazardous effects. They may have negative effects on economies, businesses, the environment, humans, and the workforce. In this paper, some of the factors that are interrelated with the COVID-19 pandemic have been examined using data mining methodologies and approaches. As a result of the analysis, some rules and insights have been discovered and the performances of the data mining algorithms have been evaluated. According to the analysis results, the JRip algorithm had the highest correct classification rate and the lowest root mean squared error (RMSE). Considering the classification rate and RMSE measures, JRip can be considered an effective method for understanding factors that are related to coronavirus-caused deaths. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
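The two measures used in the abstract above to compare algorithms, correct classification rate and root mean squared error (RMSE), can be sketched for a binary outcome as follows. This is a simplified illustration under the assumption that RMSE is computed between predicted class probabilities and 0/1 outcomes; the abstract does not specify how the authors' tooling computes it, and the data below are hypothetical.

```python
import math

def classification_rate(y_true, y_prob, threshold=0.5):
    """Share of cases whose thresholded prediction matches the true 0/1 label."""
    predicted = [1 if p >= threshold else 0 for p in y_prob]
    return sum(t == p for t, p in zip(y_true, predicted)) / len(y_true)

def rmse(y_true, y_prob):
    """Root mean squared error between 0/1 outcomes and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))

# Hypothetical labels and predicted probabilities.
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.4, 0.8]
print(classification_rate(y_true, y_prob), rmse(y_true, y_prob))
```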
42. Comprehensive Radar Data for the Contiguous United States: Multi-Year Reanalysis of Remotely Sensed Storms.
- Author
-
Williams, Skylar S., Ortega, Kiel L., Smith, Travis M., and Reinhart, Anthony E.
- Subjects
- *MACHINE learning, *THUNDERSTORMS
- Abstract
The Multi-Year Reanalysis of Remotely Sensed Storms (MYRORSS) dataset blends radar data from the WSR-88D network and Near-Storm Environmental (NSE) model analyses using the Multi-Radar Multi-Sensor (MRMS) framework. The MYRORSS dataset uses the WSR-88D archive from 1998 through 2011, processing all valid single-radar volumes to produce a seamless three-dimensional reflectivity volume over the entire contiguous United States with an approximate 5-min update frequency. The three-dimensional grid has an approximate 1 km × 1 km horizontal dimension and is on a stretched vertical grid that extends to 20 km MSL with a maximal vertical spacing of 1 km. Several reflectivity-derived, severe-storm-related products are also produced, which leverage the ability to merge the MRMS and NSE data. Two Doppler velocity-derived azimuthal shear layer maximum products are produced at a higher horizontal resolution of approximately 0.5 km × 0.5 km. The initial period of record for the dataset is 1998–2011. The dataset underwent intensive manual quality control to ensure that all available and valid data were included while excluding highly problematic radar volumes that were a negligible percentage of the overall dataset, but which caused large data errors in some cases. This dataset has applications toward radar-based climatologies, postevent analysis, machine learning applications, model verification, and warning improvements. Details of the manual quality control process are included and examples of some of these applications are presented. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
43. Are mortgage loan closing delay risks predictable? A predictive analysis using text mining on discussion threads.
- Author
-
Goldberg, David M., Zaman, Nohel, Brahma, Arin, and Aloiso, Mariano
- Subjects
- *DISCUSSION, *MORTGAGES, *MACHINE learning, *RISK assessment, *BENCHMARKING (Management), *DOCUMENTATION, *PSYCHOLINGUISTICS, *INTERPROFESSIONAL relations, *DESCRIPTIVE statistics, *ENDOWMENTS, *DATA mining
- Abstract
Loan processors and underwriters at mortgage firms seek to gather substantial supporting documentation to properly understand and model loan risks. In doing so, loan originations become prone to closing delays, risking client dissatisfaction and consequent revenue losses. We collaborate with a large national mortgage firm to examine the extent to which these delays are predictable, using internal discussion threads to prioritize interventions for loans most at risk. Substantial work experience is required to predict delays, and we find that even highly trained employees have difficulty predicting delays by reviewing discussion threads. We develop an array of methods to predict loan delays. We apply four modern out‐of‐the‐box sentiment analysis techniques, two dictionary‐based and two rule‐based, to predict delays. We contrast these approaches with domain‐specific approaches, including firm‐provided keyword searches and "smoke terms" derived using machine learning. Performance varies widely across sentiment approaches; while some sentiment approaches prioritize the top‐ranking records well, performance quickly declines thereafter. The firm‐provided keyword searches perform at the rate of random chance. We observe that the domain‐specific smoke term approaches consistently outperform other approaches and offer better prediction than loan and borrower characteristics. We conclude that text mining solutions would greatly assist mortgage firms in delay prevention. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
44. Sharing of retracted COVID-19 articles: an altmetric study.
- Author
-
Shamsi, Amrollah, Lund, Brady Daniel, and SeyyedHosseini, Shohreh
- Subjects
- *ALTMETRICS, *PROFESSIONAL peer review, *COVID-19, *SOCIAL media, *RESEARCH ethics, *PERIODICAL articles, *MISINFORMATION, *DATA analysis software, *IMPACT factor (Citation analysis), *DATA mining, *WORLD Wide Web, *BLOGS
- Abstract
Objective: This study examines the extent to which retracted articles pertaining to COVID-19 have been shared via social and mass media based on altmetric scores. Methods: Seventy-one retracted articles related to COVID-19 were identified from relevant databases, of which thirty-nine had an Altmetric Attention Score obtained using the Altmetrics Bookmarklet. Data extracted from the articles include overall attention score and demographics of sharers (e.g., geographic location, professional affiliation). Results: Retracted articles related to COVID-19 were shared tens of thousands of times to an audience of potentially hundreds of millions of readers and followers. Twitter was the largest medium for sharing these articles, and the United States was the country with the most sharers. While general members of the public were the largest proportion of sharers, researchers and professionals were not immune to sharing these articles on social media and on websites, blogs, or news media. Conclusions: These findings have potential implications for better understanding the spread of misleading or false information perpetuated in retracted scholarly publications. They emphasize the importance of quality peer review and research ethics among journals and responsibility among individuals who wish to share research findings. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
45. Development of nation-wide reference intervals using an indirect method and harmonized assays.
- Author
-
Fleming, James K., Katayev, Alex, Moorer, Candace M., Ward-Jeffries, Denean A., and Terrell, Colon L.
- Subjects
- *CHILD patients, *AGE groups, *GENDER, *CLINICAL chemistry, *PATHOLOGICAL laboratories, *BIG data, *DATA mining
- Abstract
• Reference intervals were derived from LabCorp's large database of patient test results.
• Reference intervals are age and gender based, inclusive of neonatal and geriatric populations.
• The modified indirect method of Hoffmann was used to determine the reference intervals.
• The population used for the calculation was diverse and distributed across the United States.
• Patient test results were pulled from a nation-wide system of laboratories.
• The patient population is unbiased with respect to age, gender, race or geography.
• Assays used in the calculations are standardized to instrument, method, and calibration.
• The reference intervals include 266 analytes representing over 2700 age brackets.
• Age brackets are calculated with significant N values utilizing 72 M test results.
• Reference intervals tabulated represent multiple laboratory disciplines.
For many years, clinical laboratories have either verified or estimated reference intervals (RI) for laboratory tests. Those calculations have largely been performed by direct sampling analysis of ostensibly healthy individuals or by post-analysis biochemical screening. Recently, however, indirect calculations using normal and abnormal patient data have come to the forefront as an IFCC-endorsed method. Using a large database of patient test results from Laboratory Corporation of America, age and gender based RIs, inclusive of neonatal, pediatric, and geriatric populations, were determined using a modified indirect method of Hoffmann. The RIs represent a diverse population distributed across the United States, drawn from a nation-wide system of laboratories, that is unbiased with respect to age, gender, race or geography. The tabulation of RIs using big data by an indirect method represents 72 M patient test results. The table includes 266 individual analytes consisting of approximately 2,700 age categories, including tests across multiple medical disciplines. To our knowledge, this is the largest collection of RIs that were calculated by an indirect method representing clinical chemistry, endocrinology, coagulation, and hematology analytes that have been derived with very powerful "Ns" for each age bracket. This process provides more robust RIs and allows for the determination of pediatric and geriatric RIs that would otherwise be difficult to obtain using traditional direct RI determinations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
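The abstract above relies on an indirect (Hoffmann-type) method, which estimates reference intervals from routine patient results rather than from a screened healthy cohort. The sketch below illustrates the general idea only, assuming an approximately Gaussian analyte: cumulative frequencies are mapped to normal quantiles, a straight line is fitted to the central portion, and the line is extrapolated to the 2.5th and 97.5th percentiles. It is not the authors' modified procedure; the choice of central portion (25% to 75%) and the synthetic data are assumptions.

```python
from statistics import NormalDist

def hoffmann_interval(results, central=(0.25, 0.75)):
    """Estimate a 95% reference interval from mixed (healthy + abnormal) results."""
    data = sorted(results)
    n = len(data)
    std_normal = NormalDist()
    zs, values = [], []
    for rank, value in enumerate(data, start=1):
        p = (rank - 0.5) / n  # cumulative frequency of this result
        if central[0] <= p <= central[1]:
            zs.append(std_normal.inv_cdf(p))  # normal-probability ("probit") scale
            values.append(value)
    # Ordinary least-squares line through the central, presumed-healthy portion.
    mean_z = sum(zs) / len(zs)
    mean_v = sum(values) / len(values)
    slope = sum((z - mean_z) * (v - mean_v) for z, v in zip(zs, values)) / sum(
        (z - mean_z) ** 2 for z in zs
    )
    intercept = mean_v - slope * mean_z
    z975 = std_normal.inv_cdf(0.975)
    return intercept - z975 * slope, intercept + z975 * slope

# Synthetic data: 200 "healthy" results (mean 100, SD 10) plus 20 abnormal high results.
healthy = [NormalDist(100, 10).inv_cdf((i - 0.5) / 200) for i in range(1, 201)]
abnormal = [150 + i for i in range(20)]
low, high = hoffmann_interval(healthy + abnormal)
print(round(low, 1), round(high, 1))
```

Because the abnormal results shift the cumulative frequencies, the estimate for the contaminated sample is somewhat biased relative to the true healthy interval; larger contamination makes the bias worse.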
46. Multiple comorbid conditions and healthcare resource utilization among adult patients with hyperkalemia: A retrospective observational cohort study using association rule mining.
- Author
-
Dai, Dingwei, Sharma, Ajay, Alvarez, Paula J, and Woods, Steven D
- Subjects
HYPERTENSION epidemiology, CHRONIC kidney failure, LENGTH of stay in hospitals, SCIENTIFIC observation, HOSPITAL emergency services, MEDICAL care costs, RETROSPECTIVE studies, DIABETES, ACQUISITION of data, MEDICAL care use, HYPERLIPIDEMIA, DISEASE prevalence, HOSPITAL care, MEDICAL records, RESEARCH funding, HYPERKALEMIA, MEDICAL appointments, COMORBIDITY, DATA mining, LONGITUDINAL method
- Abstract
Objectives: To estimate the prevalence of specific comorbid conditions (CCs) and multiple comorbid conditions (MCCs) among adult patients with hyperkalemia and examine the associations between MCCs and healthcare resource utilization (HRU) and costs. Methods: This retrospective observational cohort study was conducted using a large administrative claims database. We identified patients with hyperkalemia (ICD-10-CM: E87.5; or serum potassium >5.0 mEq/L; or NDC codes for either patiromer or sodium polystyrene sulfonate) during the study period (1/1/2016–6/30/2019). The earliest service/claim date with evidence of hyperkalemia was identified as the index date. Qualified patients had ≥12 months of enrolment before and after the index date and were ≥18 years of age. Comorbid conditions were assessed using all data within 12 months prior to the index date. Healthcare resource utilization and costs were estimated using all data within 12 months after the index date. Association rule mining was applied to identify MCCs. Generalized linear models were used to examine the associations between MCCs and HRU and costs. Results: Of 22,154 patients with hyperkalemia, 94% had ≥3 CCs. The most common individual CCs were chronic kidney disease (CKD, 85%), hypertension (HTN, 83%), hyperlipidemia (HLD, 81%), and diabetes mellitus (DM, 47%). The most common dyad combination of CCs was CKD+HTN (71%). The most common triad combination was CKD+HTN+HLD (62%). The most common quartet combination was CKD+HTN+HLD+DM (36%). An increased number of CCs was significantly associated with increased ED visits, length of hospital stays, and total healthcare costs (all p-values < 0.0001). Conclusions: MCCs are very prevalent among patients with hyperkalemia and are strongly associated with HRU and costs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
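The dyad, triad, and quartet prevalences in the abstract above are, in association-rule terms, the support of itemsets of comorbid conditions. The sketch below shows how support and rule confidence can be computed by brute-force enumeration. It is an illustration rather than the authors' method (production tools typically use Apriori or FP-growth to prune the search), and the five patient records are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=4):
    """Support of every itemset up to max_size that meets min_support."""
    n = len(transactions)
    counts = Counter()
    for items in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[combo] += 1
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}

def confidence(supports, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent: support(A and C) / support(A)."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return supports[both] / supports[tuple(sorted(antecedent))]

# Hypothetical comorbidity sets, one per patient.
patients = [
    {"CKD", "HTN", "HLD"},
    {"CKD", "HTN"},
    {"CKD", "HTN", "HLD", "DM"},
    {"HTN", "HLD"},
    {"CKD", "DM"},
]
supports = frequent_itemsets(patients, min_support=0.4)
print(supports[("CKD", "HTN")])                 # share of patients with both conditions
print(confidence(supports, ["CKD"], ["HTN"]))   # share of CKD patients who also have HTN
```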
47. How experimental and strategic are Business Intelligence (BI) and Data Mining applications?
- Author
-
Fontes Cruz, Rodrigo, Colaço Júnior, Methanias, and Menezes Gois, Victor
- Subjects
DATA mining, METHODOLOGY, EXPERT systems, STRATEGIC planning, CONFERENCES & conventions, EVIDENCE gaps, DATA science, BUSINESSPEOPLE
- Abstract
Copyright of Revista Ibero-Americana de Estratégia (RIAE) is the property of Revista Ibero-Americana de Estrategia/UNINOVE and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2022
- Full Text
- View/download PDF
48. Real-Time Investigation of a Large Nosocomial Influenza A Outbreak Informed by Genomic Epidemiology.
- Author
-
Javaid, Waleed, Ehni, Jordan, Gonzalez-Reiche, Ana S, Carreño, Juan Manuel, Hirsch, Elena, Tan, Jessica, Khan, Zenab, Kriti, Divya, Ly, Thanh, Kranitzky, Bethany, Barnett, Barbara, Cera, Freddy, Prespa, Lenny, Moss, Marie, Albrecht, Randy A, Mustafa, Ala, Herbison, Ilka, Hernandez, Matthew M, Pak, Theodore R, and Alshammary, Hala A
- Subjects
- *PREVENTION of infectious disease transmission, *INFLUENZA prevention, *INFLUENZA diagnosis, *CROSS infection prevention, *PREVENTION of epidemics, *INFLUENZA epidemiology, *PUBLIC health surveillance, *INFLUENZA vaccines, *INFLUENZA A virus, *SEQUENCE analysis, *IMMUNIZATION, *PREVENTION of communicable diseases, *CROSS infection, *COMMUNITY health services, *EPIDEMICS, *GENOMICS, *URBAN health, *ELECTRONIC health records, *ALLIED health personnel, *DATA mining
- Abstract
Background: Nosocomial respiratory virus outbreaks represent serious public health challenges. Rapid and precise identification of cases and tracing of transmission chains is critical to end outbreaks and to inform prevention measures. Methods: We combined conventional surveillance with influenza A virus (IAV) genome sequencing to identify and contain a large IAV outbreak in a metropolitan healthcare system. A total of 381 individuals, including 91 inpatients and 290 healthcare workers (HCWs), were included in the investigation. Results: During a 12-day period in early 2019, infection preventionists identified 89 HCWs and 18 inpatients as cases of influenza-like illness (ILI), using an amended definition without the requirement for fever. Sequencing of IAV genomes from available nasopharyngeal specimens identified 66 individuals infected with a nearly identical strain of influenza A H1N1pdm09 (43 HCWs, 17 inpatients, and 6 with unspecified affiliation). All HCWs infected with the outbreak strain had received the seasonal influenza virus vaccination. Characterization of 5 representative outbreak viral isolates did not show antigenic drift. In conjunction with IAV genome sequencing, mining of electronic records pinpointed the origin of the outbreak as a single patient and a few interactions in the emergency department that occurred 1 day prior to the index ILI cluster. Conclusions: We used precision surveillance to delineate a large nosocomial IAV outbreak, mapping the source of the outbreak to a single patient rather than HCWs as initially assumed based on conventional epidemiology. These findings have important ramifications for more-effective prevention strategies to curb nosocomial respiratory virus outbreaks. [ABSTRACT FROM AUTHOR]
- Published
- 2021
49. Prostaglandin analogues signal detection by data mining in the FDA Adverse Event Reporting System database.
- Author
-
Contreras-Salinas H, Romero-López MS, Olvera-Montaño O, and Rodríguez-Herrera LY
- Subjects
- Humans, United States epidemiology, Prostaglandins, Synthetic adverse effects, Antihypertensive Agents adverse effects, Ophthalmic Solutions adverse effects, Adverse Drug Reaction Reporting Systems statistics & numerical data, Data Mining, United States Food and Drug Administration, Databases, Factual
- Abstract
Objective: This study aims to identify safety signals of ophthalmic prostaglandin analogues through data mining the Food and Drug Administration Adverse Event Reporting System (FAERS) database. Methods: A data mining search by proportional reporting ratio, reporting OR, Bayesian confidence propagation neural network, information component 0.25 and χ² for safety signals detection was conducted to the FAERS database for the following ophthalmic medications: latanoprost, travoprost, tafluprost and bimatoprost. Results: 12 preferred terms were statistically associated: diabetes mellitus, n=2; hypoacusis, n=2; malignant mediastinal neoplasm, n=1; blood immunoglobulin E increased, n=1; cataract, n=1; blepharospasm, n=1; full blood count abnormal, n=1; skin exfoliation, n=1; chest discomfort, n=1; and dry mouth, n=1. Limitation of the Study: The FAERS database's limitations, such as the undetermined causality of cases, under-reporting and the lack of restriction to only health professionals reporting this type of event, could modify the statistical outcomes. These limitations are particularly relevant in the context of ophthalmic drug analysis, as they can affect the accuracy and reliability of the data, potentially leading to biased or incomplete results. Conclusions: Our findings have revealed a potential relationship due to the biological plausibility among malignant mediastinal neoplasm, full blood count abnormal, blood immunoglobulin E increased, diabetes mellitus, blepharospasm, cataracts, chest discomfort and dry mouth; therefore, it is relevant to continue investigating the possible drug-event association, whether to refute the safety signal or identify a new risk. Competing interests: Laboratorios Sophia provided support in the form of salaries for authors (HC-S, MSR-L, OOM, LYR-H), but did not have any additional role. (© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.)
- Published
- 2024
50. Bioinformatics-guided disproportionality analysis of sevoflurane-induced nephrogenic diabetes insipidus using the FDA Adverse Event Reporting System database.
- Author
-
Jacob AT, Kumar AH, Halivana G, Lukose L, Nair G, and Subeesh V
- Subjects
- Humans, Male, United States epidemiology, Female, Middle Aged, Adult, Data Mining, Adolescent, Aged, Young Adult, Child, Incidence, Child, Preschool, Infant, Sevoflurane adverse effects, Adverse Drug Reaction Reporting Systems statistics & numerical data, Molecular Docking Simulation, Anesthetics, Inhalation adverse effects, United States Food and Drug Administration, Diabetes Insipidus, Nephrogenic chemically induced, Diabetes Insipidus, Nephrogenic epidemiology, Databases, Factual statistics & numerical data, Computational Biology
- Abstract
Aims: Sevoflurane is an ether-based inhalational anaesthetic that induces and maintains general anaesthesia. Our study aimed to detect sevoflurane-induced nephrogenic diabetes insipidus using data mining algorithms (DMAs) and molecular docking. The FAERS database was analysed using OpenVigil 2.1 for disproportionality analysis. Methods: We analysed FAERS data from 2004 to 2022 to determine the incidence of nephrogenic diabetes insipidus associated with sevoflurane. Reporting odds ratios (RORs) and proportional reporting ratios (PRRs) with 95% confidence intervals were calculated. We also used molecular docking with AutoDock Vina to examine sevoflurane's binding affinity to relevant receptors. Results: A total of 554 nephrogenic diabetes insipidus cases were reported in FAERS, of which 2.5% (14 cases) were associated with sevoflurane. Positive signals were observed for sevoflurane with ROR of 76.012 (95% CI: 44.67-129.35) and PRR of 75.72 (χ²: 934.688). Of the 14 cases, 50% required hospitalization, 14% resulted in death, and the remaining cases were categorized as other outcomes. Molecular docking analysis showed that sevoflurane exhibited high binding affinity towards AQP2 (4NEF) and AVPR2 (6U1N) with docking scores of -4.9 and -5.3, respectively. Conclusions: Sevoflurane use is significantly associated with the incidence of nephrogenic diabetes insipidus. Healthcare professionals should be cautious when using this medication and report any adverse events to regulatory agencies. Further research is needed to validate these findings and identify risk factors while performing statistical adjustments to prevent false-positives. Clinical monitoring is crucial to validate potential adverse effects of sevoflurane. (© 2023 The Authors. British Journal of Clinical Pharmacology published by John Wiley & Sons Ltd on behalf of British Pharmacological Society.)
- Published
- 2024