563 results for "*NATURAL language processing"
Search Results
2. Shortcut Learning of Large Language Models in Natural Language Understanding.
- Author
-
MENGNAN DU, FENGXIANG HE, NA ZOU, DACHENG TAO, and XIA HU
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *ARTIFICIAL intelligence , *MACHINE learning , *ALGORITHMS , *INDUCTION (Logic) - Abstract
The article looks at the use of large language models to carry out natural language understanding (NLU) tasks. It suggests that the shortcut learning common to existing large language models based on machine learning limits how robust their performance can be because they are overly dependent on spurious correlations and incidental relationships. It discusses possible approaches to overcoming this problem in the future development of large language models.
- Published
- 2024
- Full Text
- View/download PDF
3. Building an NLP based speech recognition technology for emergency call centers.
- Author
-
Erukala, Sudarshan, Reddy, Prabhakar, Ramesh, Oruganti, Ramesh, Nagaram, Kumar, Atul, Prabhanjan, Bonthala, and Bolukonda, Prashanth
- Subjects
- *
SPEECH perception , *ARTIFICIAL neural networks , *LANGUAGE models , *CALL centers , *GAUSSIAN mixture models , *NATURAL language processing , *AUTOMATIC speech recognition - Abstract
This research explored and compared approaches to automated speech recognition for spoken conversations in emergency call centres, covering acoustic and language models as well as labelling techniques. Existing speech recognition algorithms perform poorly on this material because call-centre speech has a specialised context and is spoken in loud, emotional settings. The study therefore investigated the primary components of acoustic model designs and acoustic training methodologies, along with several information-labelling methods. Variants of Deep Neural Network/Hidden Markov Model (DNN/HMM) and Gaussian Mixture Model/Hidden Markov Model (GMM/HMM) approaches were implemented and tested to establish an efficient acoustic framework for conversational data. Language models for the conversation system were developed and evaluated using intrinsic and extrinsic criteria. When the proposed information-labelling techniques with spelling correction were compared with typical labelling techniques, they outperformed the other methodologies by a significant margin. Guided by these findings, the study identified an efficient setup for conversational speech recognition in emergency call centres: spelling correction applied to the training data for the labelling approach, a trigram language model with Kneser-Ney discounting, and a DNN/HMM acoustic model. The study was conducted on two distinct datasets gathered from emergency calls: the Dialogue dataset (27 h), which comprises the speech of the call agents, and the Summary dataset (53 h), which contains spoken summaries of those conversations describing emergency situations.
Although the recordings come from an emergency contact centre operating in Azerbaijani, a member of the Turkic language family, the proposed strategies are only loosely tied to language-specific features. As a result, it is expected that the recommended methods will also work with the other languages in the same family. [ABSTRACT FROM AUTHOR]
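The efficient setup the abstract reports pairs a trigram language model with Kneser-Ney discounting and a DNN/HMM acoustic model. As a rough sketch of the count-based language-model side, here is a minimal trigram model with plain absolute discounting (a simplification of full Kneser-Ney, which would back off to a continuation distribution); the example corpus and tokenization are invented:

```python
from collections import defaultdict

def train_trigram_counts(sentences):
    """Count trigrams and their bigram contexts over tokenized sentences."""
    tri, bi = defaultdict(int), defaultdict(int)
    for toks in sentences:
        padded = ["<s>", "<s>"] + toks + ["</s>"]
        for i in range(len(padded) - 2):
            ctx = (padded[i], padded[i + 1])
            tri[(ctx, padded[i + 2])] += 1
            bi[ctx] += 1
    return tri, bi

def discounted_prob(tri, bi, ctx, word, d=0.75):
    """P(word | ctx) with absolute discounting. The probability mass freed
    by discounting is spread uniformly over the vocabulary here; Kneser-Ney
    proper would redistribute it via a continuation distribution."""
    n = bi.get(ctx, 0)
    if n == 0:
        return 0.0
    vocab = {w for (_, w) in tri}
    types = sum(1 for (c, _), cnt in tri.items() if c == ctx and cnt > 0)
    leftover = d * types / n
    return max(tri.get((ctx, word), 0) - d, 0) / n + leftover / len(vocab)
```

For any seen context the probabilities over the vocabulary sum to one, which is the property the discounting scheme must preserve.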
- Published
- 2024
- Full Text
- View/download PDF
4. A Computational Inflection for Scientific Discovery.
- Author
-
HOPE, TOM, DOWNEY, DOUG, ETZIONI, OREN, WELD, DANIEL S., and HORVITZ, ERIC
- Subjects
- *
SCIENTIFIC knowledge , *LANGUAGE models , *SCIENTIFIC method , *ARTIFICIAL intelligence , *INFORMATION retrieval , *NATURAL language processing , *COGNITION , *HUMAN-artificial intelligence interaction - Abstract
This article presents an overview on task-guided scientific knowledge retrieval as a way for researchers to overcome the limitations of human cognitive capacity that in the age of explosive digital information creates a cognitive bottleneck. Topics include prototypes of task-guided scientific knowledge retrieval, as well as a look at novel representations, tools, and services and a review of systems that aid researchers in all aspects of scientific inquiry and discovery.
- Published
- 2023
- Full Text
- View/download PDF
5. Defining suffering in pain: a systematic review on pain-related suffering using natural language processing.
- Author
-
Noe-Steinmüller, Niklas, Scherbakov, Dmitry, Zhuravlyova, Alexandra, Wager, Tor D., Goldstein, Pavel, and Tesarz, Jonas
- Subjects
- *
NATURAL language processing , *LANGUAGE models , *ARTIFICIAL intelligence , *SUFFERING , *MACHINE learning - Abstract
Supplemental Digital Content is Available in the Text. Understanding, measuring, and mitigating pain-related suffering is a key challenge for both clinical care and pain research. However, there is no consensus on what exactly the concept of pain-related suffering includes, and it is often not precisely operationalized in empirical studies. Here, we (1) systematically review the conceptualization of pain-related suffering in the existing literature, (2) develop a definition and a conceptual framework, and (3) use machine learning to cross-validate the results. We identified 111 articles in a systematic search of Web of Science, PubMed, PsycINFO, and PhilPapers for peer-reviewed articles containing conceptual contributions about the experience of pain-related suffering. We developed a new procedure for extracting and synthesizing study information based on the cross-validation of qualitative analysis with an artificial intelligence–based approach grounded in large language models and topic modeling. We derived a definition from the literature that is representative of current theoretical views and describes pain-related suffering as a severely negative, complex, and dynamic experience in response to a perceived threat to an individual's integrity as a self and identity as a person. We also offer a conceptual framework of pain-related suffering distinguishing 8 dimensions: social, physical, personal, spiritual, existential, cultural, cognitive, and affective. Our data show that pain-related suffering is a multidimensional phenomenon that is closely related to but distinct from pain itself. The present analysis provides a roadmap for further theoretical and empirical development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. End-to-end pseudonymization of fine-tuned clinical BERT models: Privacy preservation with maintained data utility.
- Author
-
Vakili, Thomas, Henriksson, Aron, and Dalianis, Hercules
- Subjects
- *
LANGUAGE models , *DATA privacy , *PRIVACY , *NATURAL language processing - Abstract
Many state-of-the-art results in natural language processing (NLP) rely on large pre-trained language models (PLMs). These models consist of large amounts of parameters that are tuned using vast amounts of training data. These factors cause the models to memorize parts of their training data, making them vulnerable to various privacy attacks. This is cause for concern, especially when these models are applied in the clinical domain, where data are very sensitive. Training data pseudonymization is a privacy-preserving technique that aims to mitigate these problems. This technique automatically identifies and replaces sensitive entities with realistic but non-sensitive surrogates. Pseudonymization has yielded promising results in previous studies. However, no previous study has applied pseudonymization to both the pre-training data of PLMs and the fine-tuning data used to solve clinical NLP tasks. This study evaluates the effects on the predictive performance of end-to-end pseudonymization of Swedish clinical BERT models fine-tuned for five clinical NLP tasks. A large number of statistical tests are performed, revealing minimal harm to performance when using pseudonymized fine-tuning data. The results also find no deterioration from end-to-end pseudonymization of pre-training and fine-tuning data. These results demonstrate that pseudonymizing training data to reduce privacy risks can be done without harming data utility for training PLMs. [ABSTRACT FROM AUTHOR]
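As an illustration of the pseudonymization idea the abstract describes (sensitive entities automatically identified and replaced with realistic but non-sensitive surrogates), here is a toy sketch. The study's actual pipeline uses trained NER models over Swedish clinical text; the regex patterns and surrogate table below are invented for illustration:

```python
import re

# Toy surrogate table; a real system would draw realistic surrogates
# from large name/date/number lists and keep replacements consistent.
SURROGATES = {"PERSON": "Alex Berg", "DATE": "2010-01-01", "PHONE": "000-000-0000"}

# Invented patterns standing in for trained named-entity recognizers.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("PERSON", re.compile(r"\b(?:Dr|Mr|Ms)\.\s+[A-Z][a-z]+\b")),
]

def pseudonymize(text):
    """Replace detected sensitive entities with fixed surrogates."""
    for label, pattern in PATTERNS:
        text = pattern.sub(SURROGATES[label], text)
    return text
```

The same transformation would be applied to both the pre-training and the fine-tuning corpora in the end-to-end setting the study evaluates.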
- Published
- 2024
- Full Text
- View/download PDF
7. A BERT-based pretraining model for extracting molecular structural information from a SMILES sequence.
- Author
-
Zheng, Xiaofan and Tomiura, Yoichi
- Subjects
- *
ARTIFICIAL neural networks , *LANGUAGE models , *SMILING , *MOLECULAR structure , *MACHINE learning , *NATURAL language processing - Abstract
Among the various molecular properties and their combinations, it is a costly process to obtain the desired molecular properties through theory or experiment. Using machine learning to analyze molecular structure features and to predict molecular properties is a potentially efficient alternative for accelerating the prediction of molecular properties. In this study, we analyze molecular properties through the molecular structure from the perspective of machine learning. We use SMILES sequences as inputs to an artificial neural network in extracting molecular structural features and predicting molecular properties. A SMILES sequence comprises symbols representing molecular structures. To address the problem that a SMILES sequence is different from actual molecular structural data, we propose a pretraining model for a SMILES sequence based on the BERT model, which is widely used in natural language processing, such that the model learns to extract the molecular structural information contained in the SMILES sequence. In an experiment, we first pretrain the proposed model with 100,000 SMILES sequences and then use the pretrained model to predict molecular properties on 22 data sets and the odor characteristics of molecules (98 types of odor descriptor). The experimental results show that our proposed pretraining model effectively improves the performance of molecular property prediction. Scientific contribution: The 2-encoder pretraining is proposed by focusing on the lower dependency of symbols on their contextual environment in a SMILES sequence than in a natural-language sentence, and on the correspondence of one compound to multiple SMILES sequences. The model pretrained with the 2-encoder shows higher robustness in molecular property prediction tasks compared to BERT, which is adept at natural language. [ABSTRACT FROM AUTHOR]
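A BERT-style model over SMILES first needs the sequence split into chemically meaningful symbols. Below is a common SMILES tokenization pattern, offered as a hedged sketch of that preprocessing step (not necessarily the exact scheme used in the paper): multi-character tokens such as bracket atoms, `Cl`, and `Br` must be matched before single-character atoms.

```python
import re

# Alternation order matters: bracket atoms and two-letter symbols
# must be tried before single-letter atoms so "Cl" is not split into "C", "l".
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|Si|Se|se|@@|%\d{2}|[BCNOPSFIbcnops]|[=#$/\\+\-().]|\d)"
)

def tokenize_smiles(smiles):
    """Split a SMILES string into model-input symbols."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Round-trip check: the tokens must reassemble the input exactly.
    assert "".join(tokens) == smiles, "unrecognized characters in SMILES"
    return tokens
```

For example, aspirin's SMILES `CC(=O)Oc1ccccc1C(=O)O` tokenizes into atom, bond, and ring-closure symbols that can then be fed to the pretraining encoder.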
- Published
- 2024
- Full Text
- View/download PDF
8. Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain.
- Author
-
Dennstädt, Fabio, Zink, Johannes, Putora, Paul Martin, Hastings, Janna, and Cihoric, Nikola
- Subjects
- *
LANGUAGE models , *MEDICAL literature , *LITERATURE reviews , *TECHNOLOGICAL innovations , *LIKERT scale , *NATURAL language processing - Abstract
Background: Systematically screening published literature to determine the relevant publications to synthesize in a review is a time-consuming and difficult task. Large language models (LLMs) are an emerging technology with promising capabilities for the automation of language-related tasks that may be useful for such a purpose. Methods: LLMs were used as part of an automated system to evaluate the relevance of publications to a certain topic based on defined criteria and on the title and abstract of each publication. A Python script was created to generate structured prompts consisting of text strings for instruction, title, abstract, and relevant criteria to be provided to an LLM. The relevance of a publication was evaluated by the LLM on a Likert scale (low relevance to high relevance). By specifying a threshold, different classifiers for inclusion/exclusion of publications could then be defined. The approach was used with four different openly available LLMs on ten published data sets of biomedical literature reviews and on a newly human-created data set for a hypothetical new systematic literature review. Results: The performance of the classifiers varied depending on the LLM being used and on the data set analyzed. Regarding sensitivity/specificity, the classifiers yielded 94.48%/31.78% for the FlanT5 model, 97.58%/19.12% for the OpenHermes-NeuralChat model, 81.93%/75.19% for the Mixtral model and 97.58%/38.34% for the Platypus 2 model on the ten published data sets. The same classifiers yielded 100% sensitivity at a specificity of 12.58%, 4.54%, 62.47%, and 24.74% on the newly created data set. Changing the standard settings of the approach (minor adaptation of the instruction prompt and/or changing the range of the Likert scale from 1–5 to 1–10) had a considerable impact on the performance.
Conclusions: LLMs can be used to evaluate the relevance of scientific publications to a certain review topic and classifiers based on such an approach show some promising results. To date, little is known about how well such systems would perform if used prospectively when conducting systematic literature reviews and what further implications this might have. However, it is likely that in the future researchers will increasingly use LLMs for evaluating and classifying scientific publications. [ABSTRACT FROM AUTHOR]
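The screening setup described in the Methods (a structured prompt of instruction, title, abstract, and criteria; a Likert-scale relevance score from the LLM; a threshold turning scores into include/exclude decisions) can be sketched as follows. The prompt wording is invented and no actual LLM call is made; `classify` would consume whatever score the model returns:

```python
def build_prompt(title, abstract, criteria):
    """Assemble a structured screening prompt: instruction, title,
    abstract, and review criteria. Wording is illustrative only."""
    return (
        "Rate the relevance of this publication to the review topic "
        "on a scale from 1 (low relevance) to 5 (high relevance).\n"
        f"Title: {title}\n"
        f"Abstract: {abstract}\n"
        "Criteria:\n" + "\n".join(f"- {c}" for c in criteria) +
        "\nAnswer with a single number."
    )

def classify(likert_score, threshold=3):
    """Turn the LLM's Likert score into an inclusion decision; varying
    the threshold yields the different classifiers the study compares."""
    return "include" if likert_score >= threshold else "exclude"
```

Sweeping the threshold is what produces the sensitivity/specificity trade-offs reported in the Results.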
- Published
- 2024
- Full Text
- View/download PDF
9. Extracting Systemic Anticancer Therapy and Response Information From Clinical Notes Following the RECIST Definition.
- Author
-
Zuo, Xu, Kumar, Ashok, Shen, Shuhan, Li, Jianfu, Cong, Grace, Jin, Edward, Chen, Qingxia, Warner, Jeremy L., Yang, Ping, and Xu, Hua
- Subjects
- *
TREATMENT effectiveness , *LANGUAGE models , *NATURAL language processing , *ELECTRONIC health records , *DATA mining , *CANCER treatment - Abstract
PURPOSE: The RECIST guidelines provide a standardized approach for evaluating the response of cancer to treatment, allowing for consistent comparison of treatment efficacy across different therapies and patients. However, collecting such information from electronic health records manually can be extremely labor-intensive and time-consuming because of the complexity and volume of clinical notes. The aim of this study is to apply natural language processing (NLP) techniques to automate this process, minimizing manual data collection efforts, and improving the consistency and reliability of the results. METHODS: We proposed a complex, hybrid NLP system that automates the process of extracting, linking, and summarizing anticancer therapy and associated RECIST-like responses from narrative clinical text. The system consists of multiple machine learning–/deep learning–based and rule-based modules for diverse NLP tasks such as named entity recognition, assertion classification, relation extraction, and text normalization, to address different challenges associated with anticancer therapy and response information extraction. We then evaluated the system performances on two independent test sets from different institutions to demonstrate its effectiveness and generalizability. RESULTS: The system used domain-specific language models, BioBERT and BioClinicalBERT, for high-performance therapy mentions identification and RECIST responses extraction and categorization. The best-performing model achieved a 0.66 score in linking therapy and RECIST response mentions, with end-to-end performance peaking at 0.74 after relation normalization, indicating substantial efficacy with room for improvement. CONCLUSION: We developed, implemented, and tested an information extraction system from clinical notes for cancer treatment and efficacy assessment information. 
We expect this system will support future cancer research, particularly oncologic studies that focus on efficiently assessing the effectiveness and reliability of cancer therapeutics. Extracting systemic anticancer therapy and RECIST response information from clinical notes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. TaeC: A manually annotated text dataset for trait and phenotype extraction and entity linking in wheat breeding literature.
- Author
-
Nédellec, Claire, Sauvion, Clara, Bossy, Robert, Borovikova, Mariya, and Deléger, Louise
- Subjects
- *
WHEAT breeding , *NATURAL language processing , *BIOLOGICAL classification , *LANGUAGE models , *SCIENTIFIC literature , *PHENOTYPES , *WHEAT - Abstract
Wheat varieties show a large diversity of traits and phenotypes. Linking them to genetic variability is essential for shorter and more efficient wheat breeding programs. A growing number of plant molecular information networks provide interlinked interoperable data to support the discovery of gene-phenotype interactions. A large body of scientific literature and observational data obtained in-field and under controlled conditions document wheat breeding experiments. The cross-referencing of this complementary information is essential. Text from databases and scientific publications has been identified early on as a relevant source of information. However, the wide variety of terms used to refer to traits and phenotype values makes it difficult to find and cross-reference the textual information, e.g. simple dictionary lookup methods miss relevant terms. Corpora with manually annotated examples are thus needed to evaluate and train textual information extraction methods. While several corpora contain annotations of human and animal phenotypes, no corpus is available for plant traits. This hinders the evaluation of text mining-based crop knowledge graphs (e.g. AgroLD, KnetMiner, WheatIS-FAIDARE) and limits the ability to train machine learning methods and improve the quality of information. The Triticum aestivum trait Corpus is a new gold standard for traits and phenotypes of wheat. It consists of 528 PubMed references that are fully annotated by trait, phenotype, and species. We address the interoperability challenge of crossing sparse assay data and publications by using the Wheat Trait and Phenotype Ontology to normalize trait mentions and the species taxonomy of the National Center for Biotechnology Information to normalize species. The paper describes the construction of the corpus. 
A study of the performance of state-of-the-art language models for both named entity recognition and linking tasks trained on the corpus shows that it is suitable for training and evaluation. This corpus is currently the most comprehensive manually annotated corpus for natural language processing studies on crop phenotype information from the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Text-to-video generative artificial intelligence: sora in neurosurgery.
- Author
-
Mohamed, Ali A. and Lucke-Wold, Brandon
- Subjects
- *
GENERATIVE artificial intelligence , *LANGUAGE models , *NATURAL language processing , *COMPUTER vision , *ARTIFICIAL intelligence - Abstract
Artificial intelligence (AI) has increased in popularity in neurosurgery, with recent interest in generative AI algorithms such as the Large Language Model (LLM) ChatGPT. Sora, an innovation in generative AI, leverages natural language processing, deep learning, and computer vision to generate impressive videos from text prompts. This new tool has many potential applications in neurosurgery. These include patient education, public health, surgical training and planning, and research dissemination. However, there are considerable limitations to the current model such as physically implausible motion generation, spontaneous generation of subjects, unnatural object morphing, inaccurate physical interactions, and abnormal behavior presentation when many subjects are generated. Other typical concerns are with respect to patient privacy, bias, and ethics. Further, appropriate investigation is required to determine how effective generative videos are compared to their non-generated counterparts, irrespective of any limitations. Despite these challenges, Sora and other iterations of its text-to-video generative application may have many benefits to the neurosurgical community. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Enhancing post-traumatic stress disorder patient assessment: leveraging natural language processing for research of domain criteria identification using electronic medical records.
- Author
-
Miranda, Oshin, Kiehl, Sophie Marie, Qi, Xiguang, Brannock, M. Daniel, Kosten, Thomas, Ryan, Neal David, Kirisci, Levent, Wang, Yanshan, and Wang, LiRong
- Subjects
- *
NATURAL language processing , *POST-traumatic stress disorder , *ELECTRONIC health records , *LANGUAGE models , *MEDICAL needs assessment , *IDENTIFICATION , *MEDICAL record databases - Abstract
Background: Extracting research of domain criteria (RDoC) from high-risk populations like those with post-traumatic stress disorder (PTSD) is crucial for positive mental health improvements and policy enhancements. The intricacies of collecting, integrating, and effectively leveraging clinical notes for this purpose introduce complexities. Methods: In our study, we created a natural language processing (NLP) workflow to analyze electronic medical record (EMR) data and identify and extract research of domain criteria using a pre-trained transformer-based natural language model, all-mpnet-base-v2. We subsequently built dictionaries from 100,000 clinical notes and analyzed 5.67 million clinical notes from 38,807 PTSD patients from the University of Pittsburgh Medical Center. Subsequently, we showcased the significance of our approach by extracting and visualizing RDoC information in two use cases: (i) across multiple patient populations and (ii) throughout various disease trajectories. Results: The sentence transformer model demonstrated high F1 macro scores across all RDoC domains, achieving the highest performance with a cosine similarity threshold value of 0.3. This ensured an F1 score of at least 80% across all RDoC domains. The study revealed consistent reductions in all six RDoC domains among PTSD patients after psychotherapy. We found that 60.6% of PTSD women had at least one abnormal instance of the six RDoC domains, compared to 51.3% of PTSD men, with 45.1% of women showing sensorimotor disturbances compared to 41.3% of men. We also found that 57.3% of PTSD patients had at least one abnormal instance of the six RDoC domains based on our records. Veterans also showed higher rates of abnormalities in the negative and positive valence systems (60% and 51.9% of veterans, respectively) compared to non-veterans (59.1% and 49.2%, respectively).
The domains following first diagnoses of PTSD were associated with heightened cue reactivity to trauma, suicide, alcohol, and substance consumption. Conclusions: The findings provide initial insights into RDoC functioning in different populations and disease trajectories. Natural language processing proves valuable for capturing real-time, context dependent RDoC instances from extensive clinical notes. [ABSTRACT FROM AUTHOR]
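The core scoring step the Methods describe (cosine similarity between a note embedding and RDoC dictionary embeddings, thresholded at the best-performing value of 0.3) can be sketched as below. The toy vectors and domain names stand in for all-mpnet-base-v2 sentence embeddings and the study's dictionaries:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def flag_domains(note_vec, domain_vecs, threshold=0.3):
    """Return RDoC domains whose dictionary embedding meets the
    cosine-similarity threshold against the note embedding; 0.3 is the
    value the study reports as best-performing."""
    return [name for name, vec in domain_vecs.items()
            if cosine(note_vec, vec) >= threshold]
```

In the study this flagging, aggregated over millions of notes, is what yields the per-population abnormality rates reported in the Results.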
- Published
- 2024
- Full Text
- View/download PDF
13. A Survey on Evaluation of Large Language Models.
- Author
-
YUPENG CHANG, XU WANG, JINDONG WANG, YUAN WU, LINYI YANG, KAIJIE ZHU, HAO CHEN, XIAOYUAN YI, CUNXIANG WANG, YIDONG WANG, WEI YE, YUE ZHANG, YI CHANG, YU, PHILIP S., QIANG YANG, and XING XIE
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *TASK analysis , *RESEARCH personnel , *EVALUATION methodology - Abstract
Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, education, natural and social sciences, agent applications, and other areas. Secondly, we answer the 'where' and 'how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing the performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Measuring Implicit Bias in ICU Notes Using Word-Embedding Neural Network Models.
- Author
-
Cobert, Julien, Mills, Hunter, Lee, Albert, Gologorskaya, Oksana, Espejo, Edie, Jeon, Sun Young, Boscardin, W. John, Heintz, Timothy A., Kennedy, Christopher J., Ashana, Deepshikha C., Chapman, Allyson Cook, Raghunathan, Karthik, Smith, Alex K., and Lee, Sei J.
- Subjects
- *
ARTIFICIAL neural networks , *IMPLICIT bias , *NATURAL language processing , *LANGUAGE models , *MEDICAL protocols - Abstract
Language in nonmedical data sets is known to transmit human-like biases when used in natural language processing (NLP) algorithms that can reinforce disparities. It is unclear if NLP algorithms of medical notes could lead to similar transmissions of biases. Can we identify implicit bias in clinical notes, and are biases stable across time and geography? To determine whether different racial and ethnic descriptors are similar contextually to stigmatizing language in ICU notes and whether these relationships are stable across time and geography, we identified notes on critically ill adults admitted to the University of California, San Francisco (UCSF), from 2012 through 2022 and to Beth Israel Deaconess Medical Center (BIDMC) from 2001 through 2012. Because word meaning is derived largely from context, we trained unsupervised word-embedding algorithms to measure quantitatively the similarity (cosine similarity) of the context between a racial or ethnic descriptor (eg, African-American) and a stigmatizing target word (eg, noncooperative) or group of words (violence, passivity, noncompliance, nonadherence). In UCSF notes, Black descriptors were less likely to be similar contextually to violent words compared with White descriptors. By contrast, in BIDMC notes, Black descriptors were more likely to be similar contextually to violent words compared with White descriptors. The UCSF data set also showed that Black descriptors were more similar contextually to passivity and noncompliance words compared with Latinx descriptors. Implicit bias is identifiable in ICU notes. Racial and ethnic group descriptors carry different contextual relationships to stigmatizing words, depending on when and where notes were written. Because NLP models seem able to transmit implicit bias from training data, use of NLP algorithms in clinical prediction could reinforce disparities.
Active debiasing strategies may be necessary to achieve algorithmic fairness when using language models in clinical research. [ABSTRACT FROM AUTHOR]
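The study's measurement, quantifying how contextually close a descriptor sits to a set of stigmatizing target words, reduces to averaging cosine similarities between embedding vectors. A minimal sketch with toy vectors in place of the trained word embeddings:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def group_similarity(descriptor_vec, target_vecs):
    """Mean cosine similarity between one descriptor embedding and a
    group of stigmatizing target-word embeddings (e.g. violence words)."""
    return sum(cosine(descriptor_vec, v) for v in target_vecs) / len(target_vecs)
```

Comparing `group_similarity` for two descriptors (e.g. Black vs. White) against the same word group, separately per corpus and era, is what surfaces the geography- and time-dependent differences the study reports.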
- Published
- 2024
- Full Text
- View/download PDF
15. Borges and AI.
- Author
-
Raley, Rita and Samolsky, Russell
- Subjects
- *
ARTIFICIAL intelligence , *GENERATIVE artificial intelligence , *NATURAL language processing , *LANGUAGE models , *BEREAVEMENT , *ZENO'S paradoxes - Abstract
The article "Borges and AI" explores the connection between the writings of Jorge Luis Borges and artificial intelligence (AI). It discusses how Borges's story "Borges and I" foreshadows poststructuralist theories of writing and the rise of large language models (LLMs). The article delves into the themes of fictional capture and the potential for AI to surpass human creativity. It also examines the implications of AI for creative and critical writers, highlighting the challenges and uncertainties it brings. The text considers different perspectives on the relationship between human authors and generative AI models, as well as the impact of AI on education and student writing. The authors stress the importance of human involvement in textual analysis and the preservation of human authorship in the face of advancing AI technology. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
16. "Don't Ban AI from Your Writing Classroom; Require It!".
- Author
-
Hayles, N. Katherine
- Subjects
- *
ARTIFICIAL intelligence , *LANGUAGE models , *NATURAL language processing , *STUDENT cheating , *COLLEGE students , *CITATION networks - Abstract
The article discusses the use of OpenAI's ChatGPT, a large language model, in college and university writing classrooms. While some educators are concerned about students using AI to pass off their work as their own, the author argues that instead of banning AI, institutions should embrace it as a tool to accelerate student learning. The author suggests designing assignments that encourage students to develop critical relationships with algorithmic cultures and to transparently show their contributions versus what the AI contributed. The article emphasizes the importance of process-oriented assignments, collaboration, and intellectual honesty in evaluating student learning. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
17. Are Large Language Models Literary Critics?
- Author
-
Koul, Radhika
- Subjects
- *
LANGUAGE models , *CRITICS , *NATURAL language processing , *STANDARD language , *HUMAN behavior , *EMPATHY , *SITUATIONAL awareness - Abstract
This article examines the potential of large language models (LLMs) to function as literary critics. It discusses the ability of LLMs to generate human-like text and its implications for understanding thought processes. The article also explores the concept of literary reasoning, which involves high-level analogical reasoning and theory of mind. Recent research suggests that LLMs exhibit similar content-based "mistakes" in logical reasoning as humans, indicating their capacity for abstract reasoning influenced by subject matter. Additionally, studies indicate that LLMs have acquired the ability for analogical reasoning and problem-solving. The article further discusses the potential for LLMs to demonstrate theory of mind and literary reasoning, highlighting the importance of situational awareness and understanding narratives. The author conducted a study using a specific LLM, GPT-3.5-Turbo, to assess the affective appeal of fictional and nonfictional story excerpts, finding a small "fiction effect" and "embedding effect." The essay suggests that future research could incorporate literary stimuli to evaluate theory of mind in LLMs. However, further studies are needed to fully comprehend the literary reasoning capabilities of LLMs. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
18. Shakespeare Machine: New AI-Based Technologies for Textual Analysis.
- Author
-
Ehrett, Carl, Ghita, Lucian, Ranwala, Dillon, and Menezes, Alison
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *CONTENT analysis , *METADATA , *PATTERN matching , *PATTERNS (Mathematics) , *HANDWRITING recognition (Computer science) - Abstract
This article demonstrates a method using tools from the field of Natural Language Processing (NLP) to aid in analyzing theatrical texts and similar works. The method deploys pre-trained large language model neural networks to gather metadata for a text that is amenable to downstream statistical analyses surfacing patterns of interest in character dialogue. We specifically focus on Shakespeare's works, collecting metadata in the form of sentiment and emotion scores for each line of his plays. In addition to sentiment and emotion scores produced by NLP models, we also directly gather metadata such as genre, line length, and character gender. We show how these metadata may be used to illuminate a number of interesting patterns in Shakespearean character which may be difficult to detect from a direct reading of the texts. We use these metadata to expose statistically significant relationships in Shakespeare between character gender and the emotional content of that character's dialogue, controlling for genre. We also present here the publicly available dataset that we have compiled to perform these analyses. The data collects text from Shakespeare's plays along with a variety of metadata useful for this and other forms of analysis of Shakespeare's works. The methodology demonstrated here may be extended to other varieties of metadata provided by large NLP models. [ABSTRACT FROM AUTHOR]
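The downstream statistical step the abstract describes, relating character gender to the emotional content of dialogue while controlling for genre, amounts to grouping per-line metadata and comparing group means. A minimal sketch; the record fields below are assumptions for illustration, not the authors' published schema:

```python
from collections import defaultdict

def mean_sentiment_by_group(lines, keys=("gender", "genre")):
    """Average per-line sentiment scores within each (gender, genre)
    group; a stand-in for the statistical analyses run on the metadata."""
    sums, counts = defaultdict(float), defaultdict(int)
    for line in lines:
        group = tuple(line[k] for k in keys)
        sums[group] += line["sentiment"]
        counts[group] += 1
    return {g: sums[g] / counts[g] for g in sums}
```

Grouping on `("gender", "genre")` rather than gender alone is the controlling-for-genre step, since gender comparisons are then made within each genre.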
- Published
- 2024
- Full Text
- View/download PDF
19. Implicit Stance Detection with Hashtag Semantic Enrichment.
- Author
-
Dong, Li, Su, Zinao, Fu, Xianghua, Zhang, Bowen, and Dai, Genan
- Subjects
- *
LANGUAGE models , *MICROBLOGS , *NATURAL language processing , *SOCIAL media , *REPRESENTATIONS of graphs , *SOCIAL computing , *INFORMATION retrieval - Abstract
Stance detection is a crucial task in natural language processing and social computing, focusing on classifying expressed attitudes towards specific targets based on the input text. Conventional methods predominantly view stance detection as a task of target-oriented, sentence-level text classification. On popular social media platforms like Twitter, users often express their opinions through hashtags in addition to textual content within tweets. However, current methods primarily treat hashtags as data retrieval labels, neglecting to effectively utilize the semantic information they carry. In this paper, we propose a large language model knowledge-enhanced stance detection framework (LKESD) for stance detection. LKESD contains three main components: an instruction-prompted background knowledge acquisition module (IPBKA) that retrieves background knowledge of hashtags by providing handcrafted prompts to large language models (LLMs); a graph convolutional feature-enhancement module (GCFEM) that extracts the semantic representations of words that frequently co-occur with hashtags in the dataset by leveraging textual associations; and a knowledge fusion network (KFN) that selectively integrates graph representations and LLM features using a prompt-tuning framework. Extensive experimental results on three benchmark datasets demonstrate that our LKESD method outperforms the compared methods by 2.7% on all setups, validating its effectiveness in stance detection tasks. [ABSTRACT FROM AUTHOR]
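To make the instruction-prompted background knowledge step (IPBKA) concrete, here is a minimal sketch of building a handcrafted hashtag prompt and appending the returned background to the tweet before classification. The prompt template, the `[SEP]` formatting, and the stubbed `llm` callable are illustrative assumptions, not the authors' exact prompts or model.

```python
def build_hashtag_prompt(hashtag: str) -> str:
    """Handcrafted instruction prompt asking an LLM for hashtag background."""
    return (
        "You are analyzing stance on social media.\n"
        f"Explain in one sentence what the hashtag {hashtag} refers to, "
        "including the event or target it is associated with."
    )

def enrich_tweet(tweet: str, hashtag: str, llm=lambda p: "[background]") -> str:
    """Append LLM-provided hashtag background to the tweet text so a
    downstream stance classifier sees the hashtag semantics, not just
    a retrieval label. `llm` is a stub standing in for a real model."""
    return f"{tweet} [SEP] {hashtag}: {llm(build_hashtag_prompt(hashtag))}"
```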
- Published
- 2024
- Full Text
- View/download PDF
20. Evaluating the performance of Generative Pre-trained Transformer-4 (GPT-4) in standardizing radiology reports.
- Author
-
Hasani, Amir M., Singh, Shiva, Zahergivar, Aryan, Ryan, Beth, Nethala, Daniel, Bravomontenegro, Gabriela, Mendhiratta, Neil, Ball, Mark, Farhadi, Faraz, and Malayeri, Ashkan
- Subjects
- *
GENERATIVE pre-trained transformers , *LANGUAGE models , *NATURAL language processing , *ARTIFICIAL intelligence , *RADIOLOGY , *AORTIC valve insufficiency , *ADRENAL insufficiency - Abstract
Objective: Radiology reporting is an essential component of clinical diagnosis and decision-making. With the advent of advanced artificial intelligence (AI) models like GPT-4 (Generative Pre-trained Transformer 4), there is growing interest in evaluating their potential for optimizing or generating radiology reports. This study aimed to compare the quality and content of radiologist-generated and GPT-4 AI-generated radiology reports. Methods: A comparative study design was employed, in which 100 anonymized radiology reports were randomly selected and analyzed. Each report was processed by GPT-4, resulting in the generation of a corresponding AI-generated report. Quantitative and qualitative analysis techniques were utilized to assess similarities and differences between the two sets of reports. Results: The AI-generated reports showed comparable quality to radiologist-generated reports in most categories. Significant differences were observed in clarity (p = 0.027), ease of understanding (p = 0.023), and structure (p = 0.050), favoring the AI-generated reports. AI-generated reports were more concise, with 34.53 fewer words and 174.22 fewer characters on average, but had greater variability in sentence length. Content similarity was high, with an average Cosine Similarity of 0.85, Sequence Matcher Similarity of 0.52, BLEU Score of 0.5008, and BERTScore F1 of 0.8775. Conclusion: The results of this proof-of-concept study suggest that GPT-4 can be a reliable tool for generating standardized radiology reports, offering potential benefits such as improved efficiency, better communication, and simplified data extraction and analysis. However, limitations and ethical implications must be addressed to ensure the safe and effective implementation of this technology in clinical practice.
Clinical relevance statement: The findings of this study suggest that GPT-4 (Generative Pre-trained Transformer 4), an advanced AI model, has the potential to significantly contribute to the standardization and optimization of radiology reporting, offering improved efficiency and communication in clinical practice. Key Points: • Large language model–generated radiology reports exhibited high content similarity and moderate structural resemblance to radiologist-generated reports. • Performance metrics highlighted the strong matching of word selection and order, as well as high semantic similarity between AI and radiologist-generated reports. • Large language model demonstrated potential for generating standardized radiology reports, improving efficiency and communication in clinical settings. [ABSTRACT FROM AUTHOR]
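Of the similarity metrics this study reports, cosine similarity is the simplest to reproduce. The sketch below computes it over bag-of-words token counts using only the standard library; the study's BLEU and BERTScore figures require external libraries and models, so they are not shown here.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two report texts:
    the dot product of token-count vectors over the product of norms."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Identical texts score 1.0 and texts with no shared tokens score 0.0, bracketing the 0.85 average the study reports.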
- Published
- 2024
- Full Text
- View/download PDF
21. Generative artificial intelligence, co‐evolution, and language education.
- Author
-
Thorne, Steven L.
- Subjects
- *
GENERATIVE artificial intelligence , *NATURAL language processing , *LANGUAGE models , *GEMINI (Chatbot) , *COEVOLUTION , *SECOND language acquisition , *EMPATHY - Abstract
This article explores the role of technology in language education, particularly in the context of remote instruction during the COVID-19 pandemic. It discusses the use of various technologies such as video conferencing, social media, language tutorial websites, and translation tools in language teaching and learning. The article also examines the historical development of tools and the impact of technology on human anatomy and cognitive abilities. It further explores the concept of cultures-of-use, which refers to the diverse ways individuals and communities utilize digital tools. The article concludes by discussing the potential implications of generative artificial intelligence (GenAI) in language education, highlighting both the benefits and challenges associated with its use. It emphasizes the importance of responsible and effective integration of these models into instructional contexts to support full participation and equity in education, while also recognizing the crucial role of human teachers in enhancing language learning. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
22. Has artificial intelligence rendered language teaching obsolete?
- Author
-
Handley, Zoe
- Subjects
- *
ARTIFICIAL intelligence , *AUTOMATIC speech recognition , *NATURAL language processing , *LANGUAGE teachers , *SPEECH perception , *SECOND language acquisition , *LANGUAGE models - Abstract
This article explores the potential impact of artificial intelligence (AI) on language teaching and learning. It discusses the relationship between language and technology and raises concerns about the threat AI poses to language teaching. The article also emphasizes the qualities of a good language teacher, highlighting the importance of pedagogical content knowledge and interpersonal skills. It provides an overview of AI, including its definition and various technologies like chatbots and speech recognition. The article concludes by examining whether AI can replace language teachers, using examples like Duolingo, Grammarly, and Rosetta Stone. While AI-enabled language learning tools offer benefits such as unlimited practice and feedback, they have limitations in engaging learners, providing effective feedback, and employing a wide range of pedagogical techniques. They can complement human language tutors but cannot fully replace them. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
23. Open generative AI changes a lot, but not everything.
- Author
-
Chapelle, Carol A.
- Subjects
- *
GENERATIVE artificial intelligence , *NATURAL language processing , *SPEECH synthesis , *LANGUAGE models , *COMPUTER assisted language instruction , *LEARNING strategies , *LANGUAGE ability testing - Abstract
The article explores the impact of generative artificial intelligence (AI) on language and language education. It discusses the emergence of ChatGPT, a large language model, and the debates surrounding its capabilities. Language professionals are encouraged to critically engage with AI and consider its value in language education. The article also addresses challenges in promoting cultural inquiry in language instruction and identifies potential barriers to integrating AI in language education, such as decreasing enrollments and perceptions of language teaching as too basic. It highlights the potential of AI tools to enhance language learners' experience and engagement in cultural inquiry. The article mentions a conference on AI technologies in language teaching, learning, assessment, and research, and suggests three directions for future research in applied linguistics related to generative AI. It emphasizes the responsible use of AI in language learning and the importance of teaching critical cultural awareness. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
24. ELSTM: An improved long short‐term memory network language model for sequence learning.
- Author
-
Li, Zhi, Wang, Qing, Wang, Jia‐Qiang, Qu, Han‐Bing, Dong, Jichang, and Dong, Zhi
- Subjects
- *
LANGUAGE models , *TIME complexity , *RECURRENT neural networks , *NATURAL language processing , *COMPUTATIONAL complexity - Abstract
The gated structure of the long short‐term memory (LSTM) alleviates the defects of gradient disappearance and explosion in the recurrent neural network (RNN). It has received widespread attention in sequence learning such as text analysis. Although LSTM has good performance in handling remote dependencies, information loss often occurs in long‐distance transmission. To address the computational complexity and gradient dispersion of the traditional LSTM model, we propose a new model called ELSTM. This model simplifies the input gate of LSTM, reducing time complexity by removing some components, and improves the output gate. Introducing an exponential linear unit activation layer alleviates the problem of gradient dispersion. Compared with multiple existing models on language sequence prediction, ELSTM greatly reduces computation time and achieves lower perplexity, showing good performance. [ABSTRACT FROM AUTHOR]
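The exponential linear unit (ELU) that ELSTM introduces is straightforward to state: it is the identity for positive inputs and saturates smoothly toward -alpha for negative ones, so its gradient never collapses to a hard zero the way ReLU's does. A minimal sketch:

```python
import math

def elu(x: float, alpha: float = 1.0) -> float:
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def elu_grad(x: float, alpha: float = 1.0) -> float:
    """Derivative of ELU: 1 for x > 0, alpha * exp(x) otherwise,
    which stays strictly positive and so mitigates gradient dispersion."""
    return 1.0 if x > 0 else alpha * math.exp(x)
```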
- Published
- 2024
- Full Text
- View/download PDF
25. A semi-supervised method to generate a persian dataset for suggestion classification.
- Author
-
Safari, Leila and Mohammady, Zanyar
- Subjects
- *
AUTOMATIC classification , *LANGUAGE models , *NATURAL language processing , *MACHINE learning , *LONG-term memory , *CONVOLUTIONAL neural networks - Abstract
Suggestion mining has become a popular subject in the field of natural language processing (NLP) that is useful in areas like service/product improvement. The purpose of this study is to provide an automated machine learning (ML) based approach to extract suggestions from Persian text. In this research, first, a novel two-step semi-supervised method has been proposed to generate a Persian dataset called ParsSugg, which is then used in the automatic classification of the user's suggestions. The first step is manual labeling of data based on a proposed guideline, followed by a data augmentation phase. In the second step, using pre-trained Persian Bidirectional Encoder Representations from Transformers (ParsBERT) as a classifier and the data from the previous step, more data were labeled. The performance of various ML models, including Support Vector Machine (SVM), Random Forest (RF), Convolutional Neural Networks (CNN), Long Short Term Memory (LSTM), and the ParsBERT language model has been examined on the generated dataset. F-scores of 97.27 for ParsBERT and about 94.5 for the SVM and CNN classifiers were obtained for the suggestion class, a promising result for the first research on suggestion classification of Persian texts. Also, the proposed guideline can be used for other NLP tasks, and the generated dataset can be used in other suggestion classification tasks. [ABSTRACT FROM AUTHOR]
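The per-class F-score used to report the suggestion-class results combines precision and recall from true-positive, false-positive, and false-negative counts. A minimal sketch (the counts in the test are illustrative, not the ParsSugg results):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for a single class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```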
- Published
- 2024
- Full Text
- View/download PDF
26. Data augmentation strategies to improve text classification: a use case in smart cities.
- Author
-
Bencke, Luciana and Moreira, Viviane Pereira
- Subjects
- *
DATA augmentation , *SMART cities , *NATURAL language processing , *LANGUAGE models , *CLASSIFICATION algorithms - Abstract
Text classification is a very common and important task in Natural Language Processing. In many domains and real-world settings, a few labeled instances are the only resource available to train classifiers. Models trained on small datasets tend to overfit and produce inaccurate results; data augmentation (DA) techniques are an alternative for minimizing this problem. DA generates synthetic instances that can be fed to the classification algorithm during training. In this article, we explore a variety of DA methods, including back translation, paraphrasing, and text generation. We assess the impact of the DA methods over simulated low-data scenarios using well-known public datasets in English with classifiers built by fine-tuning BERT models. We describe the means to adapt these DA methods to augment a small Portuguese dataset containing tweets labeled with smart city dimensions (e.g., transportation, energy, water, etc.). Our experiments showed that some classes were noticeably improved by DA, with an improvement of 43% in terms of F1 compared to the baseline with no augmentation. In a qualitative analysis, we observed that the DA methods were able to preserve the label but failed to preserve the semantics in some cases and that generative models were able to produce high-quality synthetic instances. [ABSTRACT FROM AUTHOR]
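To show the shape of a label-preserving augmenter, here is a minimal lexical-substitution sketch: each known word is swapped for a synonym with some probability, yielding a synthetic instance with the same label. The synonym table is a toy placeholder; the paper's methods (back translation, paraphrasing, text generation) require external models and are far richer than this.

```python
import random

# Toy synonym table; real DA would use back translation or a paraphraser.
SYNONYMS = {"bus": "coach", "quick": "fast", "city": "town"}

def augment(text: str, rng: random.Random) -> str:
    """Replace each known word with its synonym with probability 0.5,
    producing a label-preserving synthetic training instance."""
    out = []
    for w in text.split():
        if w in SYNONYMS and rng.random() < 0.5:
            out.append(SYNONYMS[w])
        else:
            out.append(w)
    return " ".join(out)
```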
- Published
- 2024
- Full Text
- View/download PDF
27. Advances in machine learning with chemical language models in molecular property and reaction outcome predictions.
- Author
-
Das, Manajit, Ghosh, Ankit, and Sunoj, Raghavan B.
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *CHEMICAL models , *CHEMICAL milling , *RECURRENT neural networks , *MACHINE learning - Abstract
Molecular properties and reactions form the foundation of chemical space. Over the years, innumerable molecules have been synthesized; a smaller fraction of them found immediate applications, while a larger proportion served as a testimony to the creative and empirical nature of the domain of chemical science. With increasing emphasis on sustainable practices, it is desirable that a target set of molecules is synthesized through fewer empirical attempts rather than a larger library, to realize an active candidate. On this front, predictive endeavors using machine learning (ML) models built on available data acquire high and timely significance. Prediction of molecular property and reaction outcome remains one of the burgeoning applications of ML in chemical science. Among several methods of encoding molecular samples for ML models, the ones that employ language-like representations are gaining steady popularity. Such representations would additionally help adopt well‐developed natural language processing (NLP) models for chemical applications. Given this advantageous background, herein we describe several successful chemical applications of NLP focusing on molecular property and reaction outcome predictions. From relatively simpler recurrent neural networks (RNNs) to complex models like transformers, different network architectures have been leveraged for tasks such as de novo drug design, catalyst generation, and forward and retro‐synthesis predictions. The chemical language model (CLM) provides promising avenues toward a broad range of applications in a time and cost‐effective manner. While we showcase an optimistic outlook of CLMs, attention is also placed on the persisting challenges in the reaction domain, which would optimistically be addressed by advanced algorithms tailored to chemical language and by increased availability of high‐quality datasets. [ABSTRACT FROM AUTHOR]
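The language-like representations mentioned above are typically SMILES strings, which a chemical language model first splits into tokens. The sketch below is a simplified tokenizer (bracket atoms, two-letter halogens, then single characters); it is illustrative and does not cover the full SMILES grammar.

```python
import re

# Simplified SMILES token pattern: bracketed atoms like [NH4+],
# two-letter halogens Br/Cl, then any single character (rings, bonds, atoms).
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")

def tokenize_smiles(smiles: str):
    """Split a SMILES string into model-ready tokens."""
    return SMILES_TOKEN.findall(smiles)
```

The resulting token sequences can then be fed to RNN or transformer architectures exactly as word tokens are in NLP.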
- Published
- 2024
- Full Text
- View/download PDF
28. Use of Natural Language Processing to Infer Sites of Metastatic Disease From Radiology Reports at Scale.
- Author
-
Tay, See Boon, Low, Guat Hwa, Wong, Gillian Jing En, Tey, Han Jieh, Leong, Fun Loon, Li, Constance, Chua, Melvin Lee Kiang, Tan, Daniel Shao Weng, Thng, Choon Hua, Tan, Iain Bee Huat, and Tan, Ryan Shea Ying Cong
- Subjects
- *
LANGUAGE models , *MAGNETIC resonance imaging , *RADIOLOGY , *METASTASIS , *COMPUTED tomography , *NATURAL language processing - Abstract
PURPOSE: To evaluate natural language processing (NLP) methods to infer metastatic sites from radiology reports. METHODS: A set of 4,522 computed tomography (CT) reports of 550 patients with 14 types of cancer was used to fine-tune four clinical large language models (LLMs) for multilabel classification of metastatic sites. We also developed an NLP information extraction (IE) system (on the basis of named entity recognition, assertion status detection, and relation extraction) for comparison. Model performances were measured by F1 scores on test and three external validation sets. The best model was used to facilitate analysis of metastatic frequencies in a cohort study of 6,555 patients with 53,838 CT reports. RESULTS: The RadBERT, BioBERT, GatorTron-base, and GatorTron-medium LLMs achieved F1 scores of 0.84, 0.87, 0.89, and 0.91, respectively, on the test set. The IE system performed best, achieving an F1 score of 0.93. F1 scores of the IE system by individual cancer type ranged from 0.89 to 0.96. The IE system attained F1 scores of 0.89, 0.83, and 0.81, respectively, on external validation sets including additional cancer types, positron emission tomography-CT, and magnetic resonance imaging scans. In our cohort study, we found that for colorectal cancer, liver-only metastases were higher in de novo stage IV versus recurrent patients (29.7% v 12.2%; P <.001). Conversely, lung-only metastases were more frequent in recurrent versus de novo stage IV patients (17.2% v 7.3%; P <.001). CONCLUSION: We developed an IE system that accurately infers metastatic sites in multiple primary cancers from radiology reports. It has explainable methods and performs better than some clinical LLMs. The inferred metastatic phenotypes could enhance cancer research databases and clinical trial matching, and identify potential patients for oligometastatic interventions. Ever thought of using NLP to extract sites of metastases from radiology reports? Read this paper.
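To give a feel for the extraction task, here is a minimal rule-based sketch that finds site mentions and skips negated ones with a naive text window. The site list and 30-character negation window are illustrative assumptions, far simpler than the paper's NER + assertion-status + relation-extraction pipeline.

```python
import re

# Illustrative organ list; the paper covers many more sites and cancers.
SITES = ["liver", "lung", "bone", "brain", "adrenal"]

def extract_sites(report: str) -> set:
    """Return sites mentioned as metastatic, skipping negated mentions
    (a crude stand-in for assertion status detection)."""
    found = set()
    text = report.lower()
    for site in SITES:
        for m in re.finditer(rf"\b{site}\b", text):
            window = text[max(0, m.start() - 30):m.start()]
            if "no " in window or "without" in window:
                continue
            found.add(site)
    return found
```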
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Generative design of compounds with desired potency from target protein sequences using a multimodal biochemical language model.
- Author
-
Chen, Hengwei and Bajorath, Jürgen
- Subjects
- *
LANGUAGE models , *DEEP learning , *BIOCHEMICAL models , *AMINO acid sequence , *NATURAL language processing , *MACHINE translating , *NEUROLINGUISTICS - Abstract
Deep learning models adapted from natural language processing offer new opportunities for the prediction of active compounds via machine translation of sequential molecular data representations. For example, chemical language models are often derived for compound string transformation. Moreover, given the principal versatility of language models for translating different types of textual representations, off-the-beaten-path design tasks might be explored. In this work, we have investigated generative design of active compounds with desired potency from target sequence embeddings, representing a rather provocative prediction task. Therefore, a dual-component conditional language model was designed for learning from multimodal data. It comprised a protein language model component for generating target sequence embeddings and a conditional transformer for predicting new active compounds with desired potency. To this end, the designated "biochemical" language model was trained to learn mappings of combined protein sequence and compound potency value embeddings to corresponding compounds, fine-tuned on individual activity classes not encountered during model derivation, and evaluated on compound test sets that were structurally distinct from training sets. The biochemical language model correctly reproduced known compounds with different potency for all activity classes, providing proof-of-concept for the approach. Furthermore, the conditional model consistently reproduced larger numbers of known compounds as well as more potent compounds than an unconditional model, revealing a substantial effect of potency conditioning. The biochemical language model also generated structurally diverse candidate compounds departing from both fine-tuning and test compounds. Overall, generative compound design based on potency value-conditioned target sequence embeddings yielded promising results, rendering the approach attractive for further exploration and practical applications.
Scientific contribution: The approach introduced herein combines protein language model and chemical language model components, representing an advanced architecture, and is the first methodology for predicting compounds with desired potency from conditioned protein sequence data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Design of a large language model for improving customer service in telecom operators.
- Author
-
Xiaoliang, Ma, RuQiang, Zhao, Ying, Liu, Congjian, Deng, and Dequan, Du
- Subjects
- *
LANGUAGE models , *CUSTOMER services , *INFORMATION technology security , *QUALITY of service , *KNOWLEDGE base - Abstract
Telecommunications operators are tasked with enhancing service quality, reducing operational costs, and preserving customer privacy. This study presents an innovative application of large language models (LLMs) integrated with the LangChain technology framework, aimed at revolutionizing customer service in the telecom sector. The LangChain framework features a Knowledge Organizing Module and a Knowledge Search Module, both designed to refine customer support operations. The research develops an LLM‐based approach to improve the segmentation and organization of knowledge bases, tailored for the telecommunications industry. This approach ensures seamless integration with existing LLMs while preserving distinct knowledge domains, crucial for search accuracy. Additionally, the framework includes an advanced information security protocol with a robust filtering system that effectively excludes sensitive data from the model's outputs, enhancing data security. Empirical findings indicate that the ChatGLM2‐6B+LangChain model outperforms the baseline ChatGLM2, demonstrating heightened effectiveness in telecom‐specific tasks and outstripping even more sophisticated models like GPT‐4. The implementation of this LLM‐based framework within telecom customer service systems has significantly sharpened the precision of knowledge recommendations, as reflected by a dramatic increase in acceptance rates from 15% to 70%. [ABSTRACT FROM AUTHOR]
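One component that is easy to sketch is the output-filtering idea in the framework's information security protocol: masking sensitive customer data before an LLM answer reaches the user. The regex patterns below (11-digit mobile numbers, email addresses) are illustrative assumptions, not the framework's actual filtering rules.

```python
import re

# Illustrative sensitive-data patterns; a production filter would be
# far more comprehensive (IDs, account numbers, addresses, etc.).
PATTERNS = [
    (re.compile(r"\b\d{11}\b"), "[PHONE]"),          # 11-digit mobile numbers
    (re.compile(r"\b[\w.]+@[\w.]+\.\w+\b"), "[EMAIL]"),
]

def filter_sensitive(text: str) -> str:
    """Replace sensitive substrings in a model's output with mask tokens."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text
```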
- Published
- 2024
- Full Text
- View/download PDF
31. Automating Fault Test Cases Generation and Execution for Automotive Safety Validation via NLP and HIL Simulation.
- Author
-
Amyan, Ayman, Abboush, Mohammad, Knieke, Christoph, and Rausch, Andreas
- Subjects
- *
LANGUAGE models , *SAFETY standards , *NATURAL language processing , *FAULT location (Engineering) - Abstract
The complexity and criticality of automotive embedded electronic systems are steadily increasing, especially in automotive software development. ISO 26262 describes requirements for the development process to confirm the safety of such complex systems. Among these requirements, fault injection is a reliable technique to assess the effectiveness of safety mechanisms and verify the correct implementation of the safety requirements. However, injecting faults into the system under test is in many cases still manual and depends on an expert with a high level of knowledge of the system. In complex systems it is time-consuming, difficult to execute, and labor-intensive, so testers limit fault injection experiments to the minimum number of possible test cases. Fault injection enables testers to identify and address potential issues with a system under test before they become actual problems. In the automotive industry, failures can pose serious hazards. In these systems, it is essential to ensure that the system can operate safely even in the presence of faults. We propose an approach using natural language processing (NLP) technologies to automatically derive the fault test cases from the functional safety requirements (FSRs) and execute them automatically by hardware-in-the-loop (HIL) in real time according to the black-box concept and the ISO 26262 standard. The approach demonstrates effectiveness in automatically identifying fault injection locations and conditions, simplifying the testing process, and providing a scalable solution for various safety-critical systems. [ABSTRACT FROM AUTHOR]
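As a toy illustration of deriving a fault test case from a requirement sentence, the sketch below pattern-matches a signal name and fault type out of a requirement and emits a test-case record. The requirement grammar and output fields are illustrative assumptions, not the paper's NLP pipeline or the ISO 26262 templates.

```python
import re

# Illustrative requirement pattern: "... if the <signal> <fault> ...".
REQ_PATTERN = re.compile(
    r"if the (?P<signal>[\w\s]+?) (?P<fault>fails|is lost|drifts)",
    re.IGNORECASE,
)

def derive_test_case(requirement: str):
    """Map a functional safety requirement sentence to a fault-injection
    test case dict (injection location + fault type), or None."""
    m = REQ_PATTERN.search(requirement)
    if not m:
        return None
    return {"inject_on": m.group("signal").strip(),
            "fault_type": m.group("fault").lower()}
```

A real pipeline would hand such records to the HIL environment for automated execution.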
- Published
- 2024
- Full Text
- View/download PDF
32. Evaluating the strengths and weaknesses of large language models in answering neurophysiology questions.
- Author
-
Shojaee-Mend, Hassan, Mohebbati, Reza, Amiri, Mostafa, and Atarodi, Alireza
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *GEMINI (Chatbot) , *NEUROPHYSIOLOGY , *NEUROLINGUISTICS , *PROCESS capability , *CHATGPT - Abstract
Large language models (LLMs), like ChatGPT, Google's Bard, and Anthropic's Claude, showcase remarkable natural language processing capabilities. Evaluating their proficiency in specialized domains such as neurophysiology is crucial in understanding their utility in research, education, and clinical applications. This study aims to assess and compare the effectiveness of LLMs in answering neurophysiology questions in both English and Persian (Farsi), covering a range of topics and cognitive levels. Twenty questions covering four topics (general, sensory system, motor system, and integrative) and two cognitive levels (lower-order and higher-order) were posed to the LLMs. Physiologists scored the essay-style answers on a scale of 0–5 points. Statistical analysis compared the scores across different levels such as model, language, topic, and cognitive levels. Qualitative analysis identified reasoning gaps. In general, the models demonstrated good performance (mean score = 3.87/5), with no significant difference between languages or cognitive levels. The performance was the strongest in the motor system (mean = 4.41) while the weakest was observed in integrative topics (mean = 3.35). Detailed qualitative analysis uncovered deficiencies in reasoning, discerning priorities, and knowledge integration. This study offers valuable insights into LLMs' capabilities and limitations in the field of neurophysiology. The models demonstrate proficiency in general questions but face challenges in advanced reasoning and knowledge integration. Targeted training could address gaps in knowledge and causal reasoning. As LLMs evolve, rigorous domain-specific assessments will be crucial for evaluating advancements in their performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. LMCrot: an enhanced protein crotonylation site predictor by leveraging an interpretable window-level embedding from a transformer-based protein language model.
- Author
-
Pratyush, Pawel, Bahmani, Soufia, Pokharel, Suresh, Ismail, Hamid D, and KC, Dukka B
- Subjects
- *
LANGUAGE models , *PROTEIN models , *TRANSFORMER models , *NATURAL language processing , *AMINO acid sequence , *LATENT variables - Abstract
Motivation: Recent advancements in natural language processing have highlighted the effectiveness of global contextualized representations from protein language models (pLMs) in numerous downstream tasks. Nonetheless, strategies to encode the site-of-interest leveraging pLMs for per-residue prediction tasks, such as crotonylation (Kcr) prediction, remain largely uncharted. Results: Herein, we adopt a range of approaches for utilizing pLMs by experimenting with different input sequence types (full-length protein sequence versus window sequence), assessing the implications of utilizing per-residue embedding of the site-of-interest as well as embeddings of window residues centered around it. Building upon these insights, we developed a novel residual ConvBiLSTM network designed to process window-level embeddings of the site-of-interest generated by the ProtT5-XL-UniRef50 pLM using full-length sequences as input. This model, termed T5ResConvBiLSTM, surpasses existing state-of-the-art Kcr predictors in performance across three diverse datasets. To validate our approach of utilizing full sequence-based window-level embeddings, we also delved into the interpretability of ProtT5-derived embedding tensors in two ways: firstly, by scrutinizing the attention weights obtained from the transformer's encoder block; and secondly, by computing SHAP values for these tensors, providing a model-agnostic interpretation of the prediction results. Additionally, we enhance the latent representation of ProtT5 by incorporating two additional local representations, one derived from amino acid properties and the other from a supervised embedding layer, through an intermediate fusion stacked generalization approach, using an n-mer window sequence (or, peptide/fragment). The resultant stacked model, dubbed LMCrot, exhibits a more pronounced improvement in predictive performance across the tested datasets.
Availability and implementation: LMCrot is publicly available at https://github.com/KCLabMTU/LMCrot. [ABSTRACT FROM AUTHOR]
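The window-versus-full-sequence distinction above rests on extracting a fixed-size window of residues centered on the site-of-interest. A minimal sketch of that extraction step follows; the flank size and `X` padding character are illustrative choices, not the paper's exact settings.

```python
def site_window(sequence: str, site: int, flank: int = 3, pad: str = "X") -> str:
    """Return the residue at `site` with `flank` residues on each side,
    padding with `pad` when the window runs off either terminus."""
    left = sequence[max(0, site - flank):site]
    right = sequence[site + 1:site + 1 + flank]
    left = pad * (flank - len(left)) + left
    right = right + pad * (flank - len(right))
    return left + sequence[site] + right
```

In a pLM pipeline, either this window sequence or the per-residue embeddings at these positions (sliced from a full-length encoding) would feed the downstream predictor.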
- Published
- 2024
- Full Text
- View/download PDF
34. ChatGPT for Tinnitus Information and Support: Response Accuracy and Retest after Three and Six Months.
- Author
-
Jedrzejczak, W. Wiktor, Skarzynski, Piotr H., Raj-Koziak, Danuta, Sanfins, Milaine Dominici, Hatzopoulos, Stavros, and Kochanek, Krzysztof
- Subjects
- *
CHATGPT , *LANGUAGE models , *NATURAL language processing , *TINNITUS , *LIKERT scale - Abstract
Testing of ChatGPT has recently been performed over a diverse range of topics. However, most of these assessments have been based on broad domains of knowledge. Here, we test ChatGPT's knowledge of tinnitus, an important but specialized aspect of audiology and otolaryngology. Testing involved evaluating ChatGPT's answers to a defined set of 10 questions on tinnitus. Furthermore, given the technology is advancing quickly, we re-evaluated the responses to the same 10 questions 3 and 6 months later. The accuracy of the responses was rated by 6 experts (the authors) using a Likert scale ranging from 1 to 5. Most of ChatGPT's responses were rated as satisfactory or better. However, we did detect a few instances where the responses were not accurate and might be considered somewhat misleading. Over the first 3 months, the ratings generally improved, but there was no more significant improvement at 6 months. In our judgment, ChatGPT provided unexpectedly good responses, given that the questions were quite specific. Although no potentially harmful errors were identified, some mistakes could be seen as somewhat misleading. ChatGPT shows great potential if further developed by experts in specific areas, but for now, it is not yet ready for serious application. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Letter to the Editor: Value-based Healthcare: Can Generative Artificial Intelligence and Large Language Models be a Catalyst for Value-based Healthcare?
- Author
-
Porter, Matt A.
- Subjects
- *
GENERATIVE artificial intelligence , *LANGUAGE models , *ARTIFICIAL neural networks , *VALUE-based healthcare , *NATURAL language processing , *AKAIKE information criterion - Abstract
This document is a letter to the editor discussing the potential of generative artificial intelligence (AI) and large language models (LLMs) to contribute to value-based healthcare. The letter highlights recent regulatory actions and advancements in AI technology, such as deep learning and transformer LLMs. It emphasizes the need for clinical leadership and alignment between AI policy and healthcare goals. The author suggests that AI systems should support human intelligence and decision-making while prioritizing patient-centered care. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
36. Comparative Analysis of Artificial Intelligence Virtual Assistant and Large Language Models in Post-Operative Care.
- Author
-
Borna, Sahar, Gomez-Cabello, Cesar A., Pressman, Sophia M., Haider, Syed Ali, Sehgal, Ajai, Leibovich, Bradley C., Cole, Dave, and Forte, Antonio Jorge
- Subjects
- *
LANGUAGE models , *ARTIFICIAL intelligence , *NATURAL language processing , *POSTOPERATIVE care , *GEMINI (Chatbot) - Abstract
In postoperative care, patient education and follow-up are pivotal for enhancing the quality of care and satisfaction. Artificial intelligence virtual assistants (AIVA) and large language models (LLMs) like Google BARD and ChatGPT-4 offer avenues for addressing patient queries using natural language processing (NLP) techniques. However, the accuracy and appropriateness of the information vary across these platforms, necessitating a comparative study to evaluate their efficacy in this domain. We conducted a study comparing AIVA (using Google Dialogflow) with ChatGPT-4 and Google BARD, assessing the accuracy, knowledge gap, and response appropriateness. AIVA demonstrated superior performance, with significantly higher accuracy (mean: 0.9) and lower knowledge gap (mean: 0.1) compared to BARD and ChatGPT-4. Additionally, AIVA's responses received higher Likert scores for appropriateness. Our findings suggest that specialized AI tools like AIVA are more effective in delivering precise and contextually relevant information for postoperative care compared to general-purpose LLMs. While ChatGPT-4 shows promise, its performance varies, particularly in verbal interactions. This underscores the importance of tailored AI solutions in healthcare, where accuracy and clarity are paramount. Our study highlights the necessity for further research and the development of customized AI solutions to address specific medical contexts and improve patient outcomes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Artificial Intelligence Algorithms for Expert Identification in Medical Domains: A Scoping Review.
- Author
-
Borna, Sahar, Barry, Barbara A., Makarova, Svetlana, Parte, Yogesh, Haider, Clifton R., Sehgal, Ajai, Leibovich, Bradley C., and Forte, Antonio Jorge
- Subjects
- *
MACHINE learning , *ARTIFICIAL intelligence , *NATURAL language processing , *ALGORITHMS , *LANGUAGE models , *SUCCESS , *KNOWLEDGE management - Abstract
With abundant information and interconnectedness among people, identifying knowledgeable individuals in specific domains has become crucial for organizations. Artificial intelligence (AI) algorithms have been employed to evaluate the knowledge and locate experts in specific areas, alleviating the manual burden of expert profiling and identification. However, there is a limited body of research exploring the application of AI algorithms for expert finding in the medical and biomedical fields. This study aims to conduct a scoping review of existing literature on utilizing AI algorithms for expert identification in medical domains. We systematically searched five platforms using a customized search string, and 21 studies were identified through other sources. The search spanned studies up to 2023, and study eligibility and selection adhered to the PRISMA 2020 statement. A total of 571 studies were assessed from the search. Out of these, we included six studies conducted between 2014 and 2020 that met our review criteria. Four studies used a machine learning algorithm as their model, while two utilized natural language processing. One study combined both approaches. All six studies demonstrated significant success in expert retrieval compared to baseline algorithms, as measured by various scoring metrics. AI enhances expert finding accuracy and effectiveness. However, more work is needed in intelligent medical expert retrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. SocialNER2.0: A comprehensive dataset for enhancing named entity recognition in short human-produced text.
- Author
-
Belbekri, Adel, Benchikha, Fouzia, Slimani, Yahya, and Marir, Naila
- Subjects
- *
DEEP learning , *LANGUAGE models , *NATURAL language processing - Abstract
Named Entity Recognition (NER) is an essential task in Natural Language Processing (NLP), and deep learning-based models have shown outstanding performance. However, the effectiveness of deep learning models in NER relies heavily on the quality and quantity of labeled training datasets available. A novel and comprehensive training dataset called SocialNER2.0 is proposed to address this challenge. Based on selected datasets dedicated to different tasks related to NER, the SocialNER2.0 construction process involves data selection, extraction, enrichment, conversion, and balancing steps. The pre-trained BERT (Bidirectional Encoder Representations from Transformers) model is fine-tuned using the proposed dataset. Experimental results highlight the superior performance of the fine-tuned BERT in accurately identifying named entities, demonstrating the SocialNER2.0 dataset's capacity to provide valuable training data for performing NER in human-produced texts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
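As background to the NER task this record describes, here is a minimal sketch of decoding BIO-tagged tokens into entity spans, the standard output format for BERT-style NER models fine-tuned on datasets like SocialNER2.0. The tokens, tags, and helper name are hypothetical illustrations, not code from the paper.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, text) spans from BIO-tagged tokens.

    B-X starts an entity of type X, I-X continues it, O is outside any entity.
    """
    spans, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((etype, " ".join(current)))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:  # "O" tag, or a stray "I-" with no open entity
            if current:
                spans.append((etype, " ".join(current)))
            current, etype = [], None
    if current:
        spans.append((etype, " ".join(current)))
    return spans

tokens = ["Barack", "Obama", "visited", "Paris"]
tags = ["B-PER", "I-PER", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))  # [('PER', 'Barack Obama'), ('LOC', 'Paris')]
```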
39. Identification and Characterization of Synthetic Nicotine Product Promotion and Sales on Instagram Using Natural Language Processing.
- Author
-
Shah, Neal A, Li, Zhuoran, McMann, Tiana, Calac, Alec J, Le, Nicolette, Nali, Matthew C, Cuomo, Raphael E, and Mackey, Tim K
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *SYNTHETIC products , *SOCIAL media , *ELECTRONIC cigarettes - Abstract
Introduction: There has been a rapid proliferation of synthetic nicotine products in recent years, despite newly established regulatory authority and limited research into their health risks. Previous research has implicated social media platforms as an avenue for unregulated nicotine product sales. Yet, little is known about synthetic nicotine product content on social media. We utilized natural language processing to characterize the sales of synthetic nicotine products on Instagram. Methods: We collected Instagram posts by querying Instagram hashtags (e.g., "#tobaccofreenicotine") related to synthetic nicotine. Using Bidirectional Encoder Representations from Transformers, collected posts were categorized into thematically related topic clusters. Posts within topic clusters relevant to study aims were then manually annotated for variables related to promotion and selling (e.g., cost discussion, contact information for offline sales). Results: A total of 7425 unique posts were collected, with 2219 posts identified as related to promotion and selling of synthetic nicotine products. Nicotine pouches (52.9%, n = 1174), electronic nicotine delivery systems (30.6%, n = 679), and flavored e-liquids (14.1%, n = 313) were most commonly promoted. About 16.1% (n = 345) of posts contained embedded hyperlinks and 5.8% (n = 129) provided contact information for purported offline transactions. Only 17.6% (n = 391) of posts contained synthetic nicotine-specific health warnings. Conclusions: In the United States, synthetic nicotine products can only be legally marketed if they have received premarket authorization from the Food and Drug Administration (FDA). Despite these prohibitions, Instagram appears to be a hub for potentially unregulated sales of synthetic and "tobacco-free" products. Efforts are needed by platforms and regulators to enhance content moderation and prevent unregulated online sales of existing and emerging synthetic nicotine products.
Implications There is limited clinical understanding of synthetic nicotine's unique health risks and how these novel products are changing over time due to regulatory oversight. Despite synthetic nicotine-specific regulatory measures, such as the requirement for premarket authorization and FDA warning letters issued to unauthorized sellers, access to and promotion of synthetic nicotine is widely occurring on Instagram, a platform with over 2 billion users and one that is popular among youth and young adults. Activities include direct-to-consumer sales from questionable sources, inadequate health warning disclosure, and exposure with limited age restrictions, all conditions necessary for the sale of various tobacco products. Notably, the number of these Instagram posts increased in response to the announcement of new FDA regulations. In response, more robust online monitoring, content moderation, and proactive enforcement are needed from platforms who should work collaboratively with regulators to identify, report, and remove content in clear violation of platform policies and federal laws. Regulatory implementation and enforcement should prioritize digital platforms as conduits for unregulated access to synthetic nicotine products and other future novel and emerging tobacco products. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Text-to-SQL: A methodical review of challenges and models.
- Author
-
KANBUROĞLU, Ali Buğra and TEK, F. Boray
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *NATURAL languages - Abstract
This survey focuses on Text-to-SQL, automated translation of natural language queries into SQL queries. Initially, we describe the problem and its main challenges. Then, by following the PRISMA systematic review methodology, we survey the existing Text-to-SQL review papers in the literature. We apply the same method to extract proposed Text-to-SQL models and classify them with respect to used evaluation metrics and benchmarks. We highlight the accuracies achieved by various models on Text-to-SQL datasets and discuss execution-guided evaluation strategies. We present insights into model training times and implementations of different models. We also explore the availability of Text-to-SQL datasets in non-English languages. Additionally, we focus on large language model (LLM) based approaches for the Text-to-SQL task, where we examine LLM-based studies in the literature and subsequently evaluate the LLMs on the cross-domain Spider dataset. Finally, we conclude with a discussion of future directions for Text-to-SQL research, identifying potential areas of improvement and advancements in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Recent & Relevant.
- Subjects
- *
SOCIAL media , *GENERATIVE artificial intelligence , *NATURAL language processing , *SCIENTIFIC communication , *LANGUAGE models , *TEXT messages , *BUSINESS communication - Abstract
The article focuses on how customers express satisfaction or dissatisfaction in online restaurant reviews through the use of adversative connective constructions (ACs) and star ratings. Analyzing nearly 35,000 online restaurant reviews, the study reveals a significant relationship between the types of ACs used by customers and the star ratings they assign, providing valuable insights for restaurant owners to understand and prioritize key information in customer reviews.
- Published
- 2024
42. Surveying biomedical relation extraction: a critical examination of current datasets and the proposal of a new resource.
- Author
-
Huang, Ming-Siang, Han, Jen-Chieh, Lin, Pei-Yen, You, Yu-Ting, Tsai, Richard Tzong-Han, and Hsu, Wen-Lian
- Subjects
- *
LANGUAGE models , *CRITICAL currents , *DRUG discovery , *PROTEIN-protein interactions , *NATURAL language processing , *CELLULAR signal transduction - Abstract
Natural language processing (NLP) has become an essential technique in various fields, offering a wide range of possibilities for analyzing data and developing diverse NLP tasks. In the biomedical domain, understanding the complex relationships between compounds and proteins is critical, especially in the context of signal transduction and biochemical pathways. Among these relationships, protein–protein interactions (PPIs) are of particular interest, given their potential to trigger a variety of biological reactions. To improve the ability to predict PPI events, we propose the protein event detection dataset (PEDD), which comprises 6823 abstracts, 39 488 sentences and 182 937 gene pairs. Our PEDD dataset has been utilized in the AI CUP Biomedical Paper Analysis competition, where systems are challenged to predict 12 different relation types. In this paper, we review the state-of-the-art relation extraction research and provide an overview of the PEDD's compilation process. Furthermore, we present the results of the PPI extraction competition and evaluate several language models' performances on the PEDD. This paper's outcomes will provide a valuable roadmap for future studies on protein event detection in NLP. By addressing this critical challenge, we hope to enable breakthroughs in drug discovery and enhance our understanding of the molecular mechanisms underlying various diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Dependency Structure from Syntax to Discourse. A Corpus Study of Journalistic English: by Hongxin Zhang, London, Routledge, 2023, 296 pp., £130.00 (hardback), ISBN 978-1-032-56710-5.
- Author
-
Zhou, Yikai
- Subjects
- *
ENGLISH language , *NATURAL language processing , *LANGUAGE models , *SYNTAX (Grammar) , *DISCOURSE , *GREY relational analysis - Abstract
"Dependency Structure from Syntax to Discourse" by Hongxin Zhang is a book that applies quantitative linguistics to the study of journalistic English. The book explores the use of dependency grammar in analyzing discourse and covers topics such as syntactic and discourse dependencies. The author uses the Wall Street Journal section of the Penn Treebank for the study and provides insights into the frequency distribution, motifs, valency, and distance of discourse dependency relations. The book presents findings on syntactic and discourse dependencies, discusses language transitions, and suggests future research directions. However, it is limited to one language and genre and contains some minor errors. Overall, the book offers new perspectives for quantitative linguistics and discourse studies. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
44. Multimodal prediction of student performance: A fusion of signed graph neural networks and large language models.
- Author
-
Wang, Sijie, Ni, Lin, Zhang, Zeyu, Li, Xiaoxuan, Zheng, Xianda, and Liu, Jiamou
- Subjects
- *
GRAPH neural networks , *LANGUAGE models , *BIPARTITE graphs , *NATURAL language processing , *SCHOOL dropout prevention , *AT-risk students - Abstract
In online education platforms, accurately predicting student performance is essential for timely dropout prevention and interventions for at-risk students. This task is made difficult by the prevalent use of Multiple-Choice Questions (MCQs) in learnersourcing platforms, where noise in student-generated content and the limitations of existing unsigned graph-based models, specifically their inability to distinguish the semantic meaning between correct and incorrect responses, hinder accurate performance predictions. To address these issues, we introduce the Large Language Model enhanced Signed Bipartite graph Contrastive Learning (LLM-SBCL) model—a novel Multimodal Model utilizing Signed Graph Neural Networks (SGNNs) and a Large Language Model (LLM). Our model uses a signed bipartite graph to represent students' answers, with positive and negative edges denoting correct and incorrect responses, respectively. To mitigate noise impact, we apply contrastive learning to the signed graphs, combined with knowledge point embeddings from the LLM to further enhance our model's predictive performance. Upon evaluating our model on five real-world datasets, it demonstrates superior accuracy and stability, exhibiting an average F1 improvement of 3.7% over the best baseline models. • Student-question interactions modeled via a signed bipartite graph. • Problem cast as link sign prediction in signed bipartite graph. • Contrastive learning employed to handle student content noise. • Using large language model to extract knowledge from questions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
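The signed bipartite representation in this abstract (positive edges for correct answers, negative edges for incorrect ones, with prediction cast as link sign prediction) can be sketched with a toy edge list and a majority-sign baseline. This is an illustrative simplification, not the paper's SGNN model; the data and function names are hypothetical.

```python
from collections import defaultdict

# Toy signed bipartite graph: students on one side, questions on the other.
# Edge sign +1 = correct answer, -1 = incorrect (hypothetical data).
edges = [
    ("s1", "q1", +1), ("s1", "q2", -1),
    ("s2", "q1", +1), ("s2", "q2", +1),
    ("s3", "q1", -1),
]

by_question = defaultdict(list)
for student, question, sign in edges:
    by_question[question].append(sign)

def predict_sign(question: str) -> int:
    """Majority-sign baseline for link sign prediction: predict the sign
    most common among observed edges on this question (ties -> +1)."""
    signs = by_question[question]
    return 1 if sum(signs) >= 0 else -1

print(predict_sign("q1"))  # 1 (two correct answers vs. one incorrect)
```

A real SGNN replaces this counting baseline with learned node embeddings that propagate differently along positive and negative edges; the data structure, however, is the same signed edge list.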
45. Human‐in‐the‐loop: Human involvement in enhancing medical inquiry performance in large language models.
- Author
-
Shu, Linping, He, Qunshan, Yan, Bing, Wu, Di, Wang, Menglin, Wang, Chengshuo, and Zhang, Luo
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *MEDICAL writing - Abstract
This article discusses the role of human involvement in enhancing the performance of large language models (LLMs) in medical inquiry. The authors highlight the occasional shortcomings of LLMs in providing accurate citation information and accessing real-time data. They recommend prompt engineering as a way to enhance model performance, which involves carefully crafting instructions or queries given to LLMs to elicit specific and desired responses. The article also discusses the importance of verifying LLM outputs and acknowledges the limitations of LLMs in medical diagnoses and personalized advice. The authors conclude that the judicious implementation of the "human-in-the-loop" strategy, with a focus on prompt engineering, can greatly improve LLM capabilities in medical inquiry. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
46. ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports.
- Author
-
Jeblick, Katharina, Schachtner, Balthasar, Dexl, Jakob, Mittermeier, Andreas, Stüber, Anna Theresa, Topalis, Johanna, Weber, Tobias, Wesp, Philipp, Sabel, Bastian Oliver, Ricke, Jens, and Ingrisch, Michael
- Subjects
- *
CHATGPT , *LANGUAGE models , *NATURAL language processing , *RADIOLOGY , *PATIENT-centered care , *RADIOLOGIC technologists , *DIAGNOSTIC ultrasonic imaging personnel - Abstract
Objectives: To assess the quality of simplified radiology reports generated with the large language model (LLM) ChatGPT and to discuss challenges and chances of ChatGPT-like LLMs for medical text simplification. Methods: In this exploratory case study, a radiologist created three fictitious radiology reports which we simplified by prompting ChatGPT with "Explain this medical report to a child using simple language." In a questionnaire, we tasked 15 radiologists to rate the quality of the simplified radiology reports with respect to their factual correctness, completeness, and potential harm for patients. We used Likert scale analysis and inductive free-text categorization to assess the quality of the simplified reports. Results: Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed relevant medical information, and potentially harmful passages were reported. Conclusion: While we see a need for further adaption to the medical field, the initial insights of this study indicate a tremendous potential in using LLMs like ChatGPT to improve patient-centered care in radiology and other medical domains. Clinical relevance statement: Patients have started to use ChatGPT to simplify and explain their medical reports, which is expected to affect patient-doctor interaction. This phenomenon raises several opportunities and challenges for clinical routine. Key Points: • Patients have started to use ChatGPT to simplify their medical reports, but their quality was unknown. • In a questionnaire, most participating radiologists overall asserted good quality to radiology reports simplified with ChatGPT. However, they also highlighted a notable presence of errors, potentially leading patients to draw harmful conclusions. • Large language models such as ChatGPT have vast potential to enhance patient-centered care in radiology and other medical domains. 
To realize this potential while minimizing harm, they need supervision by medical experts and adaption to the medical field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Small2BERT for extractive text summarization.
- Author
-
Tantius, Cornelius, Shintaro, Chrismorgan, Soelistio, Elizabeth Ann, Kristanto, Jonathan, Nelson, Rico, and Girsang, Abba Suganda
- Subjects
- *
AUTOMATIC summarization , *LANGUAGE models , *TEXT summarization , *NATURAL language processing , *WORD frequency , *MACHINE learning - Abstract
Bidirectional Encoder Representations from Transformers (BERT) is a machine learning technique for solving various natural language processing (NLP) problems, including summarization. One general problem with BERT is that training the model takes considerable time and resources. This paper aims to implement a small BERT architecture, called Small2BERT, which is expected to summarize the data. Therefore, this study compares pre-trained Small2BERT and Summarization using Word Frequency (SWF) in extracting text summaries. This research uses the dataset of Indian News Summary, which includes news articles from the Hindu, Indian Times, and Guardian from February to August 2017. Small2BERT surpasses SWF in precision, whereas SWF achieves a higher overall F1-score. Small2BERT earns a 0.307 F1-score on ROUGE-1, compared to 0.35 for SWF. Small2BERT scores 0.19 and 0.304 while SWF achieves 0.304 and 0.31 in F1-score for ROUGE-2 and ROUGE-L, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
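The ROUGE-1 F1 metric used above to compare Small2BERT and SWF is, at its core, unigram-overlap precision and recall between a candidate and a reference summary. A minimal from-scratch sketch (no stemming or stopword handling, unlike full ROUGE implementations):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat", "the cat sat on the mat"), 3))  # 0.667
```

The example shows why a short, precise summary can win on precision while losing F1: all three candidate words match (precision 1.0), but only half the reference is covered (recall 0.5), mirroring the Small2BERT-versus-SWF trade-off in the abstract.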
48. Transformer-based models for hate speech classification.
- Author
-
Jain, Deepti, Arora, Sandhya, Jha, C. K., and Malik, Garima
- Subjects
- *
HATE speech , *DEEP learning , *NATURAL language processing , *TRANSFORMER models , *LANGUAGE models , *MACHINE learning - Abstract
This research paper explores the application of text classification and natural language processing techniques for enhancing hate speech detection. The study employs machine learning (ML) and deep learning models, including transformer models such as BERT, RoBERTa, and DistilBERT, to improve the accuracy of hate speech classifiers. Through a comprehensive empirical analysis on three diverse datasets (Data-ICWSM, Data-ALW2, Data-OLID), the study demonstrates the effectiveness of these models in accurately identifying hate speech. Compared to traditional baselines, the BERT models exhibit a significant performance boost in macro and weighted F1-scores. Additionally, the study addresses the challenge of imbalanced class distributions in the datasets by employing sampling techniques during training. Overall, the research highlights the potential of transformer models for hate speech detection and provides insights for future exploration in this domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
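The abstract above reports macro and weighted F1-scores, which diverge exactly when class distributions are imbalanced, as is typical in hate speech datasets. A minimal sketch of both metrics on hypothetical toy labels (pure Python; libraries such as scikit-learn provide the same semantics via an `average` parameter):

```python
from collections import Counter

def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating it as the positive label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def macro_and_weighted_f1(y_true, y_pred):
    labels = sorted(set(y_true))
    f1s = {lab: per_class_f1(y_true, y_pred, lab) for lab in labels}
    counts = Counter(y_true)
    macro = sum(f1s.values()) / len(labels)          # unweighted class mean
    weighted = sum(f1s[lab] * counts[lab] for lab in labels) / len(y_true)
    return macro, weighted

# Imbalanced toy labels: "hate" is the minority class (hypothetical data).
y_true = ["none"] * 8 + ["hate"] * 2
y_pred = ["none"] * 8 + ["hate", "none"]  # one minority-class miss
macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(round(macro, 3), round(weighted, 3))  # 0.804 0.886
```

Because the single error falls on the minority class, macro F1 (which weights both classes equally) drops further than weighted F1, which is why papers on imbalanced tasks report both.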
49. How to Mitigate Hallucination Risk in GenAI.
- Author
-
Black, Lamont and Stern, Matthew
- Subjects
- *
GENERATIVE artificial intelligence , *HALLUCINATIONS , *NATURAL language processing , *ACCOUNTING standards , *LANGUAGE models - Abstract
The article shares strategies for accounting and finance professionals to mitigate hallucination risk and safeguard against the threat of incorrect or misleading information in generative artificial intelligence (AI). These include setting the precision parameters of chatbots for large language models (LLMs), prompt engineering, checking the references provided, uploading documents and files, fine-tuning an AI model on a specialized data set, and using a retrieval-augmented generation system.
- Published
- 2024
50. LEADERSHIP DEVELOPMENT PROGRAMS NEED A REFRESH.
- Author
-
JULIAN, AMANDA
- Subjects
- *
LEADERSHIP training , *ARTIFICIAL intelligence , *LANGUAGE models , *GENERATIVE artificial intelligence , *NATURAL language processing - Abstract
Leadership development programs may not be meeting the needs of today's business world and modern workers, according to survey data. Traditional approaches such as coaching, workshops, and online courses are limited in their ability to deliver significant outcomes. To address this, a paradigm shift is needed in leadership development, embracing technological advancements such as artificial intelligence (AI). AI can personalize training experiences, provide tailored feedback and support, and offer real-time, in-the-moment assistance. Additionally, incorporating community-based learning and maintaining a balance between technological advancement and human interaction are crucial considerations. [Extracted from the article]
- Published
- 2024