5,155 results for "*NATURAL language processing"
Search Results
2. Shortcut Learning of Large Language Models in Natural Language Understanding.
- Author
-
MENGNAN DU, FENGXIANG HE, NA ZOU, DACHENG TAO, and XIA HU
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *ARTIFICIAL intelligence , *MACHINE learning , *ALGORITHMS , *INDUCTION (Logic) - Abstract
The article looks at the use of large language models to carry out natural language understanding (NLU) tasks. It suggests that the shortcut learning common to existing large language models based on machine learning limits how robust their performance can be because they are overly dependent on spurious correlations and incidental relationships. It discusses possible approaches to overcoming this problem in the future development of large language models.
- Published
- 2024
- Full Text
- View/download PDF
3. Flower shop mobile chatbot.
- Author
-
Tan, Andrew Xue-Yee, Chong, Lee-Ying, Chong, Siew-Chin, and Goh, Pey-Yun
- Subjects
- *
CHATBOTS , *NATURAL language processing , *DIGITAL technology , *MACHINE learning , *COMPUTER software , *CUSTOMER satisfaction - Abstract
A chatbot is a computer program that mimics human conversation and performs streamlined communication with the user through text messages. The usage of chatbots in the commercial field has been increasing recently because they enhance customer satisfaction and product sales. However, most chatbots depend on antiquated models that can only provide pre-programmed responses, which can be unclear and erroneous. This paper aims to build an automated mobile chatbot for a flower shop to assist it in reducing costs, boosting the business's growth, and establishing connections with clients in an increasingly digital environment. Most chatbots to date have been rule-based, answering customers according to predefined conversation rules. The proposed chatbot is implemented with machine learning approaches and natural language processing algorithms to enhance its adaptability and human-like qualities; it learns from unresolved questions based on answers supplied by the admin. In addition, an admin interface is created to ease data management in the proposed chatbot. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. VADER-IT: A sentiment analysis tool for the Italian language.
- Author
-
Martinis, Maria Chiara, Zucco, Chiara, and Cannataro, Mario
- Subjects
- *
ITALIAN language , *SENTIMENT analysis , *DEEP learning , *TRANSFORMER models , *MEDICAL writing , *NATURAL language processing , *TEXT recognition - Abstract
"Polarity detection" is a technique in the field of sentiment analysis, a subset of natural language processing (NLP), which determines the emotional polarity of a text, i.e., whether it expresses a positive, negative, or neutral opinion. Despite the impact of Deep Learning techniques, and in particular Transformers, on Natural Language Processing, these approaches are not without downsides, generally related to low-resource languages and domains. In this article, a lexicon-based approach is applied to extract polarity from medical reviews written in Italian and collected from an Italian portal. Specifically, VADER-IT, an adaptation of the popular VADER sentiment analysis tool tailored to the Italian language, is employed to predict the overall sentiment expressed in 5,491 Italian-language reviews about several Italian hospital departments specialized in heart-related diseases. The reviews are accessible on the Qsalute website (https://www.qsalute.it/). The results show a micro-averaged F1-score of 90.8% and a micro-averaged Jaccard score of 83%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
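As a companion to the entry above, here is a minimal sketch of lexicon-based polarity detection in the spirit of VADER-IT. The tiny Italian lexicon, negation handling, and decision thresholds are illustrative assumptions, not the tool's actual resources.

```python
# Illustrative lexicon-based polarity detector (NOT VADER-IT's real lexicon).
LEXICON = {
    "ottimo": 3.0, "eccellente": 2.9, "buono": 1.9, "gentile": 1.5,
    "lento": -1.2, "pessimo": -2.8, "terribile": -3.0,
}
NEGATIONS = {"non", "mai"}

def polarity(text: str) -> str:
    """Sum lexicon valences, flipping the sign after a negation word."""
    score, flip = 0.0, 1.0
    for tok in text.lower().split():
        if tok in NEGATIONS:
            flip = -1.0          # negate the next sentiment-bearing word
            continue
        if tok in LEXICON:
            score += flip * LEXICON[tok]
            flip = 1.0
    if score > 0.05:
        return "positive"
    if score < -0.05:
        return "negative"
    return "neutral"
```

The real tool also handles intensifiers, punctuation, and emphasis; this sketch only shows the core lexicon-plus-heuristics idea.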
5. Chatbot application training using natural language processing techniques: Case of small-scale agriculture.
- Author
-
Ong, R. J., Raof, R. A. A., Sudin, S., and Choong, K. Y.
- Subjects
- *
CHATBOTS , *DATABASES , *NATURAL language processing , *TACIT knowledge , *AGRICULTURE - Abstract
Tacit knowledge, which is based on first-hand experience and is more difficult to articulate, has evolved alongside natural languages as they are passed down through the years. In computing, Natural Language Processing (NLP) refers to a set of methods for automatically analysing and modelling human languages. Extracting or searching through vast bodies of unstructured text for specific information can be a complex and time-consuming process. Knowledge comes in several shapes and sizes but can usually be differentiated into two types: structured and unstructured. Using NLP techniques, unstructured text data can be translated into a structured and well-organized database and then used for question-answering purposes. This paper describes the implementation of NLP techniques to convert unstructured text data into a structured database for chatbot application training. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
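The entry above describes converting unstructured text into a structured database for chatbot training. A minimal sketch of that idea, using an in-memory SQLite table and naive keyword overlap for retrieval; the sample Q&A text and the matching rule are illustrative assumptions, not the paper's pipeline:

```python
import re
import sqlite3

# Illustrative unstructured notes (hypothetical agricultural Q&A text).
RAW_NOTES = """
Q: When should chilli seedlings be transplanted? A: About 30 days after sowing.
Q: How often should the crop be watered? A: Twice a day in dry weather.
"""

def build_db(raw: str) -> sqlite3.Connection:
    """Parse free text into structured (question, answer) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE faq (question TEXT, answer TEXT)")
    for q, a in re.findall(r"Q:\s*(.+?)\s*A:\s*(.+)", raw):
        conn.execute("INSERT INTO faq VALUES (?, ?)", (q, a))
    return conn

def answer(conn: sqlite3.Connection, query: str) -> str:
    """Return the stored answer whose question shares the most words."""
    words = set(re.findall(r"\w+", query.lower()))
    best, best_overlap = "Sorry, I don't know yet.", 0
    for q, a in conn.execute("SELECT question, answer FROM faq"):
        overlap = len(words & set(re.findall(r"\w+", q.lower())))
        if overlap > best_overlap:
            best, best_overlap = a, overlap
    return best
```

A real system would replace the keyword overlap with proper NLP (tokenization, stemming, embeddings), but the unstructured-to-structured step is the same in spirit.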
6. Arabic automatic question generation using transformer model.
- Author
-
Alhashedi, Saleh Saleh, Suaib, Norhaida Mohd, and Bakri, Aryati
- Subjects
- *
TRANSFORMER models , *NATURAL language processing , *INTERNET content , *CHILDREN'S books , *ELECTRONIC textbooks , *LINGUISTIC models - Abstract
Students of all ages benefit greatly from the use of questions in the evaluation process and in the improvement of their overall educational outcomes. With the educational process's adaptation and shift to online education, and the rapid growth of educational content on the internet, institutions, schools, and academic organisations struggle to generate exam questions in a timely manner using outdated manual methods. Exam question preparation is a complex and time-consuming activity that calls for in-depth familiarity with the subject matter and the skill to build the questions, both of which grow more challenging as text size increases. The goal of automatic question generation (AQG) is to generate questions that are both natural and relevant from a variety of text data inputs, with the possibility of providing an answer. The Arabic language has seen only a small number of contributions to this problem. Many existing works rely on rule-based methods, using input text from children's books, stories, or textbooks to manually construct question styles. These models lack linguistic diversity, and the tasks become increasingly difficult and time-consuming as the quantity of text increases. In Natural Language Processing (NLP), the Transformer is one of the most flexible deep-learning models. In this research, we propose a fully automated Arabic AQG model built on the Transformer architecture, which can take a single Arabic document of unlimited length and generate N questions from it. These questions can be used in educational contexts. Our model achieves 19.12 BLEU, 23.00 METEOR, and 51.99 ROUGE-L on the mMARCO dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
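The entry above reports BLEU, METEOR, and ROUGE-L scores. For intuition, here is a simplified sentence-level BLEU (clipped n-gram precision with a brevity penalty); production evaluations typically use corpus-level, smoothed implementations such as sacreBLEU, so treat this as a sketch only.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        matched = sum(min(c[g], r[g]) for g in c)   # clip counts by reference
        total = max(sum(c.values()), 1)
        log_prec += math.log(max(matched, 1e-9) / total)
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec / max_n)
```

A perfect match scores 1.0 (reported BLEU is usually this value times 100, so the paper's 19.12 corresponds to about 0.19 on this scale).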
7. Song lyrics genre detection using RNN.
- Author
-
Pasha, Syed Nawaz, Ramesh, Dadi, Mohmmad, Sallauddin, Shabana, Kothandaraman, D., and Sravanthi, T.
- Subjects
- *
SONG lyrics , *DIGITAL music , *SHORT-term memory , *BIRDSONGS , *LONG-term memory , *NATURAL language processing , *MUSICAL aesthetics - Abstract
Digitalization of music is the new trend, and individual preferences vary widely. Millions of songs are streamed on music applications, and the companies providing these services need to sort and arrange a wide range of music tastes for all of their users. On top of that, fresh music from various artists across a wide spectrum of genres appears every day. To keep track of all this, a classification system can be handy. We therefore propose an RNN-based model built on natural language processing to classify songs into different genres based on their lyrics [1]. Additionally, this tool can help music lovers quickly identify which genre a particular song belongs to. In this paper, we apply a Long Short-Term Memory (LSTM) model with both Universal Sentence Encoder (USE) and BERT embedders. A comparative study is performed to understand which combination of models works best to classify genres based on lyrics. From our results, on the basis of model accuracy, we found that the USE embedder with LSTM [2] gives slightly better performance than the BERT embedder. The LSTM model with USE embedding gave the highest accuracy of 83.42% when trained over five folds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Automatic essay scoring using NLP.
- Author
-
Sheshikala, M., Rajesh, Mothe, and Akarapu, Mahesh
- Subjects
- *
CONVOLUTIONAL neural networks , *NATURAL language processing , *ESSAYS , *ARTIFICIAL intelligence , *MANUAL labor , *TRANSFORMER models - Abstract
Updates in the educational sector have historically had a positive impact on people's engagement with technology. Our project is one such effort, where rating essays is the main task we work on. Essay evaluation is a systematic way of assigning ratings to written essays, and automatic essay scoring is the process of grading essays without human intervention. The computer systems are trained using artificial intelligence architectures in which natural language processing comes into the picture. Making machines resemble human intelligence and work as a human would is the main motive of natural language processing. Within this area, we have chosen a part of the educational domain to build a system capable of rating written work, namely essays. Our project aims to provide a solution that evaluates essays automatically. The basic idea is to develop a software system that can benefit educational institutions, business organizations, researchers, etc. Automatic essay scoring offers powerful gains: it reduces manual work, scores every element without bias, and plays a key role in time efficiency. Past approaches to developing automated essay-scoring systems used regression analysis and convolutional neural networks, whereas we worked with a transformer-based model, namely BERT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. A natural language based intelligent banking chatbot.
- Author
-
Pasha, Nawaz, Ramesh, Dadi, Mohmmad, Sallauddin, Shabana, Dhandapani, Kothandaraman, and Mendu, Mruthyunjaya
- Subjects
- *
CHATBOTS , *NATURAL language processing , *NATURAL languages , *WEB-based user interfaces , *CALL centers , *DEVELOPMENT banks - Abstract
A chatbot is an intelligent system that enables human-to-machine interaction, and chatbots are among the biggest development trends today. Contacting customer centers or visiting a bank for banking-related queries consumes a lot of time and human effort; furthermore, the customer may receive insufficient information and face uncertainty throughout the process. This paper describes the development of a banking chatbot that gives guidance regarding the services provided by a bank and provides detailed answers to the user's queries. The chatbot is user-friendly to communicate with and offers an easy way to get a timely response. To overcome these difficulties, a web application using natural language processing and a neural network is developed [1]. NLP is an added advantage that helps the chatbot understand user queries. The operations in this banking chatbot include viewing beneficiaries and posting queries regarding banking services. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Building an NLP based speech recognition technology for emergency call centers.
- Author
-
Erukala, Sudarshan, Reddy, Prabhakar, Ramesh, Oruganti, Ramesh, Nagaram, Kumar, Atul, Prabhanjan, Bonthala, and Bolukonda, Prashanth
- Subjects
- *
SPEECH perception , *ARTIFICIAL neural networks , *LANGUAGE models , *CALL centers , *GAUSSIAN mixture models , *NATURAL language processing , *AUTOMATIC speech recognition - Abstract
This research explored and compared approaches to automatic speech recognition for spoken conversations in emergency call centres, including acoustic and language models as well as labelling techniques. Existing speech recognition algorithms perform poorly here because contact-centre speech has a special context and is spoken in loud, emotional settings. Consequently, the primary components of speech recognition architectures and acoustic training methodologies, together with several methods for labelling transcription data, were investigated and analysed. Variants of Deep Neural Network/Hidden Markov Model (DNN/HMM) and Gaussian Mixture Model/Hidden Markov Model (GMM/HMM) approaches were implemented and tested in order to establish an efficient acoustic framework for conversational data. Furthermore, language models for the conversational system were developed and evaluated using intrinsic and extrinsic criteria. When the recommended data labelling techniques with spelling correction are compared with typical labelling techniques, they outperform the other methodologies by a significant margin. Guided by the investigation's findings, we showed that applying spelling correction to the training data for labelling, a trigram language model with Kneser-Ney discounting, and a DNN/HMM acoustic model form an efficient setup for conversational speech recognition in emergency call centres. This study was conducted on two distinct datasets gathered from emergency calls: the Dialogue dataset (27 h), which comprises the speech of the call agents, and the Summary dataset (53 h), which contains spoken summaries of those conversations describing the emergency situations.
Although the emergency call centre operates in Azerbaijani, a language of the Turkic family, our strategies are only loosely tied to its particular linguistic features. As a result, the recommended methods are expected to work with the other languages in the same family as well. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Developing chat server for addressing FAQ's about creative learning.
- Author
-
Joshi, Shridhar, Warik, Akshada, Rathod, Harsh, Jha, Jayshree, and Aher, Anagha
- Subjects
- *
NATURAL language processing , *ONLINE education , *SENTIMENT analysis , *BOTNETS , *TEXT messages , *DATA mining - Abstract
During the pandemic, rapid growth in online learning and educational platforms was seen. However, a major problem faced by learners was that they missed student-teacher interaction as well as peer-to-peer interaction. To bridge this gap, features like forums, live chat, etc. were introduced. In this project, we build a live chat server using Node.js, Express.js, and Socket.IO. One thing observed on these online learning platforms is that users may not always consider the sentiments and feelings of other users present on the live server. To prevent the use of slang, vulgar language, and inappropriate words, we use an analysis bot that scans for such words; if any are found, the message is highlighted and is not uploaded to the server until corrected. This helps maintain the quality of chats on the website, considering that it is meant to be a learning platform where users from various backgrounds, age groups, and nationalities will be present. Sentiment analysis, data mining, and natural language processing are used to analyze each message. The degree of inappropriateness of a text message is calculated based on the classification of words done using a Naive Bayes classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
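The entry above scores message inappropriateness with a Naive Bayes classifier. Here is a minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing; the toy training messages and the two-class labels are illustrative assumptions, not the project's data:

```python
import math
from collections import Counter

# Hypothetical labelled chat messages ("ok" vs "bad" = inappropriate).
TRAIN = [
    ("great lesson thanks for sharing", "ok"),
    ("can you explain this exercise please", "ok"),
    ("you are a stupid idiot", "bad"),
    ("shut up you idiot", "bad"),
]

def fit(samples):
    """Count word frequencies per class and class priors."""
    counts = {"ok": Counter(), "bad": Counter()}
    priors = Counter()
    for text, label in samples:
        priors[label] += 1
        counts[label].update(text.split())
    vocab = {w for c in counts.values() for w in c}
    return counts, priors, vocab

def classify(text, model):
    """Pick the class maximizing log P(class) + sum log P(word | class)."""
    counts, priors, vocab = model
    best_label, best_score = None, -math.inf
    for label in counts:
        score = math.log(priors[label] / sum(priors.values()))
        total = sum(counts[label].values()) + len(vocab)
        for w in text.split():
            score += math.log((counts[label][w] + 1) / total)  # add-one smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In the described system, the "bad" posterior could serve as the degree-of-inappropriateness score that decides whether a message is held back.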
12. A systematic review on various applications and challenges in deep learning.
- Author
-
Yeole, Ashwini N. and S., Guru Prasad M.
- Subjects
- *
MACHINE learning , *DEEP learning , *NATURAL language processing , *IMAGE recognition (Computer vision) , *COMPUTER vision , *ARTIFICIAL intelligence - Abstract
Deep learning and machine learning are essential since most companies require smart analytics to stay competitive. Artificial intelligence gave rise to machine learning, which in turn gave rise to deep learning. Machine Learning still dominates business analytics with its algorithms, despite the high-end uses of Deep Learning in areas like Computer Vision and Natural Language Processing. By summarizing the numerous machine learning models that are currently available on the market, this survey article demonstrates the learning transition from machine learning to deep learning. It also provides insight into deep learning models and methodologies, as well as the challenges faced and the expected future course of deep learning. A potent, cutting-edge method for analyzing photos, especially remote sensing (RS) images, is deep learning (DL). Remote sensing image scene classification, which attempts to assign semantic categories to remote sensing images based on their contents, has a variety of applications. Thanks to the powerful feature learning capabilities of DNNs, deep learning-based remote sensing image scene categorization has generated a lot of interest and made significant progress. A range of remote sensing applications, such as estimating water availability, monitoring water change over time, and predicting droughts and floods, can benefit from surface water mapping. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Show It or Tell It? Text, Visualization, and Their Combination: When communicating information, language should be considered as co-equal with visualization.
- Author
-
HEARST, MARTI A.
- Subjects
- *
DATA visualization , *LANGUAGE & languages , *COMMUNICATION in information science , *LITERACY , *NATURAL language processing , *USER interfaces - Abstract
This article emphasizes the corresponding role language should have, along with visualization, in the communication of information. Topics include the combination and balance of text and visualization, an investigation of text without visualization, necessary improvements to cognitive models, and how natural language processing can impact information communication.
- Published
- 2023
- Full Text
- View/download PDF
14. Voice in the Machine: Ethical Considerations for Language-Capable Robots: Parsing the promise of language-capable robots.
- Author
-
Williams, T., Matuszek, Cynthia, Jokinen, Kristiina, Korpan, Raj, Pustejovsky, James, and Scassellati, Brian
- Subjects
- *
ARTIFICIAL intelligence & ethics , *ETHICS , *COMPUTATIONAL linguistics , *NATURAL language processing , *DISCRIMINATION (Sociology) , *PREJUDICES - Abstract
The article discusses various ethical considerations for language-capable robots. These concerns include trust, influence, identity, and privacy, and will require consideration by researchers, practitioners, and the general public. Various potential negative outcomes are discussed including robot control over human morals, a default identity perception grounded in white heteropatriarchy, gendered and racialized language-capable robots, and the potential for robots to be used as mobile surveillance tools.
- Published
- 2023
- Full Text
- View/download PDF
15. A Computational Inflection for Scientific Discovery.
- Author
-
HOPE, TOM, DOWNEY, DOUG, ETZIONI, OREN, WELD, DANIEL S., and HORVITZ, ERIC
- Subjects
- *
SCIENTIFIC knowledge , *LANGUAGE models , *SCIENTIFIC method , *ARTIFICIAL intelligence , *INFORMATION retrieval , *NATURAL language processing , *COGNITION , *HUMAN-artificial intelligence interaction - Abstract
This article presents an overview on task-guided scientific knowledge retrieval as a way for researchers to overcome the limitations of human cognitive capacity that in the age of explosive digital information creates a cognitive bottleneck. Topics include prototypes of task-guided scientific knowledge retrieval, as well as a look at novel representations, tools, and services and a review of systems that aid researchers in all aspects of scientific inquiry and discovery.
- Published
- 2023
- Full Text
- View/download PDF
16. Analogous Forecasting for Predicting Sport Innovation Diffusion: From Business Analytics to Natural Language Processing.
- Author
-
Wanless, Liz and Naraine, Michael L.
- Subjects
- *
NATURAL language processing , *SPORTS forecasting , *DIFFUSION of innovations , *BUSINESS analytics , *DIFFUSION of innovations theory , *HOCKEY players , *FUTUROLOGISTS - Abstract
The purpose of this study was to analyze the diffusion of one sport innovation to forecast a second. Contextualized within the diffusion of innovations theory, this study investigated cumulative business analytics (BA) diffusion as an analog for cumulative natural language processing (NLP) diffusion in professional sport. A total of 89 of the 123 teams in the Big Four North American men's professional sport leagues contributed: 21 from the National Football League, 23 from the National Basketball Association, 22 from Major League Baseball, and 23 from the National Hockey League. Utilizing an analogous forecasting approach, a discrete derivation of the Bass model was applied to cumulative BA adoption data. Parameters were then extended to predict cumulative NLP adoption. The resulting BA-estimated parameters (p = .0072, q = .3644) showed a close fit to NLP diffusion (root mean square error of approximation = 3.51, mean absolute error = 2.98), thereby validating BA as a predictor of the takeoff and full adoption of NLP. This study illuminates an ongoing and isomorphic process for diffusion of innovations in the professional sport social system and generates a novel application of diffusion of innovations theory to the sport industry. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
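The discrete Bass model referenced in the entry above can be sketched as follows, using the BA-estimated parameters reported in the abstract (p = .0072, q = .3644). The market size m = 123 (the leagues' total team count) and the 25-period horizon are illustrative assumptions, not the study's fitted setup.

```python
def bass_forecast(p: float, q: float, m: float, periods: int) -> list[float]:
    """Cumulative adopters N(t) under the discrete Bass model:
    n(t) = (p + q * N(t-1) / m) * (m - N(t-1)),  N(t) = N(t-1) + n(t),
    where p is the coefficient of innovation and q of imitation."""
    cumulative, N = [], 0.0
    for _ in range(periods):
        n = (p + q * N / m) * (m - N)   # new adopters this period
        N += n
        cumulative.append(N)
    return cumulative

# Parameters from the abstract; m and horizon are assumed for illustration.
curve = bass_forecast(p=0.0072, q=0.3644, m=123, periods=25)
```

The curve rises slowly at first (innovation-driven), accelerates as imitation dominates, and saturates toward m, which is the S-shaped takeoff the study forecasts for NLP adoption.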
17. Quantifying social capital creation in post‐disaster recovery aid in Indonesia: methodological innovation by an AI‐based language model.
- Author
-
Marutschke, Daniel Moritz, Nurdin, Muhammad Riza, and Hirono, Miwa
- Abstract
Smooth interaction with a disaster‐affected community can create and strengthen its social capital, leading to greater effectiveness in the provision of successful post‐disaster recovery aid. To understand the relationship between the types of interaction, the strength of social capital generated, and the provision of successful post‐disaster recovery aid, intricate ethnographic qualitative research is required, but it is likely to remain illustrative because it is based, at least to some degree, on the researcher's intuition. This paper thus offers an innovative research method employing a quantitative artificial intelligence (AI)‐based language model, which allows researchers to re‐examine data, thereby validating the findings of the qualitative research, and to glean additional insights that might otherwise have been missed. This paper argues that well‐connected personnel and religiously‐based communal activities help to enhance social capital by bonding within a community and linking to outside agencies and that mixed methods, based on the AI‐based language model, effectively strengthen text‐based qualitative research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Mapping dynamic human sentiments of heat exposure with location-based social media data.
- Author
-
Lyu, Fangzheng, Zhou, Lixuanwu, Park, Jinwoo, Baig, Furqan, and Wang, Shaowen
- Subjects
- *
USER-generated content , *NATURAL language processing , *CITY dwellers , *HEAT waves (Meteorology) , *SPATIAL resolution - Abstract
Understanding urban heat exposure dynamics is critical for public health, urban management, and climate change resilience. Near real-time analysis of urban heat enables quick decision-making and timely resource allocation, thereby enhancing the well-being of urban residents, especially during heatwaves or electricity shortages. To serve this purpose, we develop a cyberGIS framework to analyze and visualize human sentiments of heat exposure dynamically based on near real-time location-based social media (LBSM) data. Large volumes of low-cost LBSM data, together with a content analysis algorithm based on natural language processing, are used effectively to generate near real-time heat exposure maps from human sentiments on social media at both city and national scales, at kilometer spatial resolution and census-tract spatial units. We conducted a case study to visualize and analyze human sentiments of heat exposure in Chicago and the United States in September 2021. Enabled with high-performance computing, dynamic visualization of heat exposure is achieved at fine spatiotemporal scales, while heat exposure detected from social media data can be used to understand heat exposure from a human perspective and allow timely responses to extreme heat. Highlights: (1) near real-time and high spatial resolution mapping of human sentiments of heat exposure with Twitter data; (2) an integrated cyberGIS and machine learning framework for visualizing heat exposure with Twitter data; (3) human sentiment of heat exposure mapping in the City of Chicago and the United States. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Can ChatGPT Evaluate Plans?
- Author
-
Fu, Xinyu, Wang, Ruoniu, and Li, Chaosu
- Subjects
- *
CHATGPT , *LANGUAGE models , *NATURAL language processing , *HUMAN error - Abstract
Large language models, such as ChatGPT, have recently risen to prominence in producing human-like conversation and assisting with various tasks, particularly for analyzing high-dimensional textual materials. Because planning researchers and practitioners often need to evaluate planning documents that are long and complex, a novel question has emerged: Can ChatGPT evaluate plans? In this study, we addressed this question by leveraging ChatGPT to evaluate the quality of plans and comparing the results with those produced by human coders. Through the evaluation of 10 climate change plans, we discovered that ChatGPT's evaluation results coincided reasonably well (with an average agreement of 68%) with those from the traditional content analysis approach. We further scrutinized the differences by conducting a more in-depth analysis of the results from ChatGPT and manual evaluation to uncover what might have contributed to the variance in results. Our findings indicate that ChatGPT struggled to comprehend planning-specific jargon, yet it could reduce human errors by capturing details in complex planning documents. Finally, we provide insights into leveraging this cutting-edge technology in future planning research and practice. ChatGPT cannot be used to replace humans in plan quality evaluation yet. However, it is an effective tool to complement human coders to minimize human errors by identifying discrepancies and fact-checking machine-generated responses. ChatGPT generally cannot understand planning jargon, so planners wanting to use this tool should use extra caution when planning terminologies are present in their prompts. Creating effective prompts for ChatGPT is an iterative process that requires specific instructions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges.
- Author
-
Wang, Jiajia, Huang, Jimmy Xiangji, Tu, Xinhui, Wang, Junmei, Huang, Angela Jennifer, Laskar, Md Tahmid Rahman, and Bhuiyan, Amran
- Published
- 2024
- Full Text
- View/download PDF
21. Multimodal Machine Learning for Prediction of 30-Day Readmission Risk in Elderly Population.
- Author
-
Loutati, Ranel, Ben-Yehuda, Arie, Rosenberg, Shai, and Rottenberg, Yakir
- Subjects
- *
MACHINE learning , *OLDER people , *NATURAL language processing , *PATIENT readmissions , *OLDER patients - Abstract
Readmission within 30 days is a prevalent issue among elderly patients, linked to unfavorable health outcomes. Our objective was to develop and validate multimodal machine learning models for predicting 30-day readmission risk in elderly patients discharged from internal medicine departments. This was a retrospective cohort study which included elderly patients aged 75 or older, who were hospitalized at the Hadassah Medical Center internal medicine departments between 2014 and 2020. Three machine learning algorithms were developed and employed to predict 30-day readmission risk. The primary measures were predictive model performance scores, specifically area under the receiver operator curve (AUROC), and average precision. This study included 19,569 admissions. Of them, 3258 (16.65%) resulted in 30-day readmission. Our 3 proposed models demonstrated high accuracy and precision on an unseen test set, with AUROC values of 0.87, 0.89, and 0.93, respectively, and average precision values of 0.76, 0.78, and 0.81. Feature importance analysis revealed that the number of admissions in the past year, history of 30-day readmission, Charlson score, and admission length were the most influential variables. Notably, the natural language processing score, representing the probability of readmission according to a textual-based model trained on social workers' assessment letters during hospitalization, ranked among the top 10 contributing factors. Leveraging multimodal machine learning offers a promising strategy for identifying elderly patients who are at high risk for 30-day readmission. By identifying these patients, machine learning models may facilitate the effective execution of preventive actions to reduce avoidable readmission incidents. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
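The entry above evaluates its readmission models by AUROC. A minimal sketch of that metric via its rank (Mann-Whitney) formulation; the toy labels and scores below are illustrative, not the study's data:

```python
# AUROC via the Mann-Whitney formulation: the probability that a randomly
# chosen positive example is scored above a randomly chosen negative one,
# counting ties as half a win.
def auroc(labels: list[int], scores: list[float]) -> float:
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75 (one pair misranked)
```

An AUROC of 0.5 means random ranking and 1.0 means perfect separation, so the reported values of 0.87 to 0.93 indicate strong discrimination.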
22. Calculated Medicine: Seven Decades of Accelerating Growth.
- Author
-
Leff, Louis E. and Koperwas, Mara L.
- Subjects
- *
NATURAL language processing , *CLINICAL decision support systems , *ARTIFICIAL intelligence , *HEALTH equity , *DATABASES - Abstract
The field of Calculated Medicine has grown substantially over the last 7 decades. Comprised of objective, evidence-based medical decision tools, Calculated Medicine has broad application in medical practice, medical research, and health care management. This article reviews the history and varied methodologies of Calculated Medicine, starting with the 1953 Apgar score and concluding with a look into modern computational tools of the field: machine learning, natural language processing, artificial intelligence, and in silico research techniques. We'll also review and quantify the rapidly accelerating growth of Calculated Medicine in the medical literature. Our database of journal articles referring to the field has accumulated over 1.8 million citations, with more than 460 new citations (on average) posted every day. Using natural language processing, we examine and analyze this burgeoning database. Lastly, we examine an important new direction of Calculated Medicine: self-reflection on its potential effect on racial and ethnic disparities in health care. Our field is making great strides promoting health care equality, and some of the most prominent contributions will be reviewed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Defining suffering in pain: a systematic review on pain-related suffering using natural language processing.
- Author
-
Noe-Steinmüller, Niklas, Scherbakov, Dmitry, Zhuravlyova, Alexandra, Wager, Tor D., Goldstein, Pavel, and Tesarz, Jonas
- Subjects
- *
NATURAL language processing , *LANGUAGE models , *ARTIFICIAL intelligence , *SUFFERING , *MACHINE learning - Abstract
Supplemental Digital Content is Available in the Text. Understanding, measuring, and mitigating pain-related suffering is a key challenge for both clinical care and pain research. However, there is no consensus on what exactly the concept of pain-related suffering includes, and it is often not precisely operationalized in empirical studies. Here, we (1) systematically review the conceptualization of pain-related suffering in the existing literature, (2) develop a definition and a conceptual framework, and (3) use machine learning to cross-validate the results. We identified 111 articles in a systematic search of Web of Science, PubMed, PsychINFO, and PhilPapers for peer-reviewed articles containing conceptual contributions about the experience of pain-related suffering. We developed a new procedure for extracting and synthesizing study information based on the cross-validation of qualitative analysis with an artificial intelligence–based approach grounded in large language models and topic modeling. We derived a definition from the literature that is representative of current theoretical views and describes pain-related suffering as a severely negative, complex, and dynamic experience in response to a perceived threat to an individual's integrity as a self and identity as a person. We also offer a conceptual framework of pain-related suffering distinguishing 8 dimensions: social, physical, personal, spiritual, existential, cultural, cognitive, and affective. Our data show that pain-related suffering is a multidimensional phenomenon that is closely related to but distinct from pain itself. The present analysis provides a roadmap for further theoretical and empirical development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Linguistic Features of Secondary School Writing: Can Natural Language Processing Shine a Light on Differences by Sex, English Language Status, or Higher Scoring Essays?
- Author
-
Tate, Tamara P., Kim, Young-Suk Grace, Collins, Penelope, Warschauer, Mark, and Olson, Carol Booth
- Subjects
- *
NATURAL language processing , *SECONDARY schools , *ENGLISH language , *WOMEN authors , *BILINGUAL students - Abstract
This article provides three major contributions to the literature: we provide granular information on the development of student argumentative writing across secondary school; we replicate the MacArthur et al. model of Natural Language Processing (NLP) writing features that predict quality with a younger group of students; and we are able to examine the differences for students across language status. In our study, we sought to find the average levels of text length, cohesion, connectives, syntactic complexity, and word-level complexity in this sample across Grades 7-12 by sex, by English learner status, and for essays scoring above and below the median holistic score. Mean levels of variables by grade suggest a developmental progression with respect to text length, with text length increasing with grade level, but the other variables in the model were fairly stable. Sex did not seem to affect the model in meaningful ways beyond the increased fluency of women writers. We saw text length and word-level differences between initially designated and redesignated bilingual students compared to their English-only peers. Finally, we see that the model works better with our higher scoring essays and is less effective at explaining the lower scoring essays. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Beyond Plagiarism: ChatGPT as the Vanguard of Technological Revolution in Research and Citation.
- Author
-
Flaherty, Hanni B. and Yurch, Jackson
- Subjects
- *
INTELLECT , *PLAGIARISM , *COMPUTER software , *SOCIAL workers , *INTERPROFESSIONAL relations , *ARTIFICIAL intelligence , *NATURAL language processing , *CITATION analysis , *SOCIAL work research , *CREATIVE ability , *TECHNOLOGY , *ABILITY , *TRAINING - Abstract
In the landscape of academic research and citation practices, the emergence of ChatGPT, an artificial intelligence language model developed by OpenAI, represents a transformative leap forward. This paper delves into the multifaceted role of ChatGPT in revolutionizing scholarly endeavors beyond mere plagiarism detection. We explore how ChatGPT facilitates research collaboration, streamlines literature reviews, and assists in proper citation practices. By harnessing ChatGPT's contextual understanding and vast knowledge repository, social work researchers can unlock new avenues of creativity and efficiency in knowledge acquisition and dissemination. Moreover, this paper discusses the ethical considerations surrounding the integration of AI in academia and underscores the need for guidelines and education to ensure responsible usage. Ultimately, ChatGPT stands at the forefront of a technological revolution, empowering social work researchers to push the boundaries of knowledge acquisition and dissemination in unprecedented ways. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Using Sentiment Analysis to Understand Public Policy Nicknames: Obamacare and the Affordable Care Act.
- Author
-
Lappeman, James, Goder, Aliah, Naicker, Kalencia, Faruki, Hamza, and Gordon, Patrick
- Subjects
- *
SENTIMENT analysis , *SOCIAL media , *GOVERNMENT policy , *NICKNAMES ,PATIENT Protection & Affordable Care Act - Abstract
In this study, we compared the social media net sentiment of one policy with two names. Specifically, we analyzed Obamacare and the Affordable Care Act (ACA) to understand how social media users engaged with each term on social media from March 2010 to March 2017. The net sentiment was measured with a sample of over 50 million micro-blogs, and the analysis was done using a combination of digital instruments and human validation. We found a significant difference between the social media engagement and sentiment of both terms, with the ACA performing significantly better than Obamacare, despite Obamacare's higher conversation volume. With the ACA having an average of 26% less negative sentiment than Obamacare, the findings of this study emphasize the need to be careful when attaching nicknames to public policy. The findings also have implications for policymakers and politicians. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
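The Obamacare/ACA study above reports "net sentiment" over a sample of 50 million micro-blogs. A common social-media-listening definition (assumed here; the study's exact operationalization may differ) is positive minus negative mentions over all classified mentions:

```python
# Net sentiment sketch: (positive - negative) / total classified
# mentions, in [-1, 1]. All counts below are hypothetical, invented
# to illustrate the metric, not figures from the study.

def net_sentiment(pos, neg, neutral):
    total = pos + neg + neutral
    return (pos - neg) / total if total else 0.0

# Hypothetical mention counts for the two names of the same policy:
obamacare = net_sentiment(pos=120, neg=300, neutral=580)
aca = net_sentiment(pos=150, neg=180, neutral=670)
print(obamacare, aca)
```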
27. Measuring the impact of climate risk on renewable energy stock volatility: A case study of G20 economies.
- Author
-
Zhang, Li, Liang, Chao, Huynh, Luu Duc Toan, Wang, Lu, and Damette, Olivier
- Subjects
- *
RENEWABLE energy sources , *GREENHOUSE gas mitigation , *VOLATILITY (Securities) , *MARKET volatility , *NATURAL language processing , *ALTERNATIVE fuels , *ENERGY shortages ,GROUP of Twenty countries - Abstract
The contemporary world faces significant challenges in energy crises and climate change. To analyze the relationship between energy and climate, we explore the influence of the climate-related attention of G20 countries on renewable energy stock volatility forecasting under the framework of the extended GARCH-MIDAS model. In the context of COP26, we further adopt natural language processing technology and a shrinkage approach to obtain Google search volume for 107 climate-related keywords and then construct new climate risk attention indicators. The in-sample parameter estimation results show that the climate attention of G20 countries has a remarkable positive effect on renewable energy stock market volatility. The out-of-sample results demonstrate that the climate attention of different countries exerts varying influences on the volatility of the renewable energy stock market. Climate risk and energy issues are among the serious challenges facing the 21st century, and reducing greenhouse gas emissions and finding cleaner energy is an urgent task. As the response to climate change necessitates diverse strategies in various countries, our research can offer valuable guidance and serve as a reference for national energy transitions and the selection of alternative energy solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
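The volatility study above extends GARCH-MIDAS. As a hedged sketch of the plain GARCH(1,1) conditional-variance recursion that family builds on (sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}), with illustrative, not estimated, parameters and invented returns:

```python
# GARCH(1,1) variance filter sketch. Parameters omega/alpha/beta and
# the return series are invented for illustration; GARCH-MIDAS adds a
# long-run MIDAS component on top of this short-run recursion.

def garch11_variances(returns, omega, alpha, beta):
    """Conditional variance path, started at the unconditional variance."""
    sigma2 = [omega / (1 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2

rets = [0.01, -0.02, 0.015, -0.005]
var_path = garch11_variances(rets, omega=1e-6, alpha=0.05, beta=0.9)
print([round(v, 7) for v in var_path])
```

Estimation of the parameters (and the MIDAS extension) would normally be done with a dedicated econometrics package rather than by hand.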
28. Measuring Alliance and Symptom Severity in Psychotherapy Transcripts Using Bert Topic Modeling.
- Author
-
Lalk, Christopher, Steinbrenner, Tobias, Kania, Weronika, Popko, Alexander, Wester, Robin, Schaffrath, Jana, Eberhardt, Steffen, Schwartz, Brian, Lutz, Wolfgang, and Rubel, Julian
- Subjects
- *
PSYCHOTHERAPY , *SPEECH therapists , *THERAPEUTIC alliance , *ARTIFICIAL intelligence , *MACHINE learning - Abstract
We aim to use topic modeling, an approach for discovering clusters of related words ("topics"), to predict symptom severity and therapeutic alliance in psychotherapy transcripts, while also identifying the most important topics and overarching themes for prediction. We analyzed 552 psychotherapy transcripts from 124 patients. Using BERTopic (Grootendorst, 2022), we extracted 250 topics each for patient and therapist speech. These topics were used to predict symptom severity and alliance with various competing machine-learning methods. Sensitivity analyses were calculated for a model based on 50 topics, LDA-based topic modeling, and a bigram model. Additionally, we grouped topics into themes using qualitative analysis and identified key topics and themes with eXplainable Artificial Intelligence (XAI). Symptom severity could be predicted with the highest accuracy by patient topics (r = 0.45, 95%-CI 0.40, 0.51), whereas alliance was better predicted by therapist topics (r = 0.20, 95%-CI 0.16, 0.24). Drivers for symptom severity were themes related to health and negative experiences. Lower alliance was correlated with various themes, especially psychotherapy framework, income, and everyday life. This analysis shows the potential of using topic modeling in psychotherapy research, allowing several treatment-relevant metrics to be predicted with reasonable accuracy. Further, the use of XAI allows for an analysis of the individual predictive value of topics and themes. Limitations entail heterogeneity across different topic modeling hyperparameters and a relatively small sample size. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
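The predictive accuracy in the BERTopic study above is reported as a correlation r between predicted and observed scores. A self-contained sketch of the Pearson correlation coefficient on invented numbers (not the study's predictions):

```python
# Pearson r sketch: covariance of the two series divided by the
# product of their standard deviations. Inputs are invented examples.

import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

predicted = [2.0, 3.5, 1.0, 4.0, 2.5]   # hypothetical model outputs
observed  = [2.2, 3.0, 1.5, 4.5, 2.0]   # hypothetical true scores
print(round(pearson_r(predicted, observed), 2))
```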
29. FDT − Dr2T: a unified Dense Radiology Report Generation Transformer framework for X-ray images.
- Author
-
Sharma, Dhruv, Dhiman, Chhavi, and Kumar, Dinesh
- Abstract
Medical Image Captioning (MIC) is a developing area of artificial intelligence that combines two main research areas, computer vision and natural language processing. In order to support clinical workflows and decision-making, MIC is used in a variety of applications pertaining to diagnosis, therapy, report production, and computer-aided diagnosis. The generation of long and coherent reports highlighting correct abnormalities is a challenging task. Therefore, in this direction, this paper presents an efficient FDT-Dr2T framework for the generation of coherent radiology reports with efficient exploitation of medical content. The proposed framework leverages the fusion of texture features and deep features in the first stage by incorporating an ISCM-LBP + PCA-HOG feature extraction algorithm and a Convolutional Triple Attention-based Efficient XceptionNet (C-TaXNet). Further, fused features from the FDT module are utilized by the Dense Radiology Report Generation Transformer (Dr2T) model with modified multi-head attention, generating dense radiology reports by highlighting specific crucial abnormalities. To evaluate the performance of the proposed FDT-Dr2T, extensive experiments are conducted on the publicly available IU Chest X-ray dataset, and the best performance of the work is observed as 0.531 BLEU@1, 0.398 BLEU@2, 0.322 BLEU@3, 0.251 BLEU@4, 0.384 CIDEr, 0.506 ROUGE-L, 0.277 METEOR. An ablation study is carried out to support the experiments. Overall, the results obtained demonstrate the efficiency and efficacy of the proposed framework. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. AI assistance for coaches and therapists.
- Author
-
Blyler, Abigail P. and Seligman, Martin E. P.
- Subjects
- *
PSYCHOTHERAPISTS , *DECISION support systems , *CONSCIOUSNESS , *MEDICAL technology , *ARTIFICIAL intelligence , *PSYCHOTHERAPIST attitudes , *COACHES (Athletics) , *NATURAL language processing , *DESCRIPTIVE statistics , *RESEARCH methodology , *SOCIAL support , *PSYCHOSOCIAL factors , *USER interfaces - Abstract
We found previously that ChatGPT-4 could use 50 stream-of-consciousness thoughts to make the latent construct of narrative identity explicit. We now demonstrate this as a tool for interventions by coaches and therapists. Using five narrative identities, ChatGPT-4 recommended actionable strategies and interventions tailored to the narrative identity. Artificial intelligence (AI) can thus support coaches and therapists by crafting personalized approaches drawing on the person's narrative identity. This new assistive tool may help clients achieve greater insight, growth and well-being. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Evaluating artificial intelligence responses to respiratory medicine questions.
- Author
-
Luo, Hong, Yan, Jisong, and Zhou, Xia
- Subjects
- *
ARTIFICIAL intelligence , *ACINETOBACTER infections , *CHATGPT , *NATURAL language processing , *MEDICAL terminology - Abstract
This article evaluates the use of an artificial intelligence tool called ChatGPT in the field of respiratory medicine. ChatGPT is designed to provide human-like responses to a wide range of topics, including medical inquiries. However, there are concerns about its accuracy and reliability in the medical sector, as it has not been extensively trained with biomedical datasets or vetted by medical professionals. The study found that ChatGPT provided correct answers for 63.5% of respiratory medicine questions, with higher accuracy in basic medical knowledge and lower accuracy in treatment and management. It is important to note that ChatGPT should not replace professional medical advice and should be used with caution. The article highlights the limitations of ChatGPT and suggests that it can be used as a supplementary educational tool under expert guidance. Future research should focus on improving the model's accuracy and reliability in specialized domains like respiratory medicine. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
32. Fusion of Semantic, Visual and Network Information for Detection of Misinformation on Social Media.
- Author
-
Ahuja, Nishtha and Kumar, Shailender
- Abstract
Fake news is the most prevailing buzzword in today's world. Fake news detection is a very time-consuming task that requires fact-checking either manually or automatically using machine learning techniques. The existing techniques focus only on textual and image content. In this article, we have proposed a multimodal model, FakeMine, for the mining of fake news content on social media. Our model explores the network structure of social media posts using Graph Neural Networks and combines them with semantic information from text and images in order to attain better accuracy. Our proposed work uses BERT for textual representations while preserving the semantic relationships in news articles. The image features are represented using VGG-19. The propagation structure of the circulating fake news is captured using Graph Neural Networks. All the features computed are fused together for a better classification. For classification, we have used LSTM, which is optimized using Chimp Optimization. The FakeMine model was able to achieve an accuracy of 97.65%, which exceeded all other baseline models for multiple modalities. It performed better than other models when tested on individual modalities as well. The proposed optimized LSTM classifier was also able to perform better than all other baseline classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
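FakeMine, described above, fuses features from three modalities before classification. A minimal sketch of that fusion step, with each encoder stubbed out by a fixed vector (in the paper these would be BERT, VGG-19, and GNN outputs; the vectors here are invented):

```python
# Late-fusion sketch: concatenate per-modality feature vectors into a
# single input for the downstream classifier. Real systems would emit
# hundreds of dimensions per modality; tiny stand-ins are used here.

def fuse(*feature_vectors):
    """Concatenate per-modality feature vectors into one flat vector."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused

text_feats  = [0.2, 0.7]   # stand-in for a BERT text embedding
image_feats = [0.1, 0.9]   # stand-in for VGG-19 visual features
graph_feats = [0.5]        # stand-in for a GNN propagation embedding

x = fuse(text_feats, image_feats, graph_feats)
print(len(x))  # total dimensionality across all modalities
```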
33. Multimodal and immersive systems for skills development and education.
- Author
-
Di Mitri, Daniele, Limbu, Bibeg, Schneider, Jan, Iren, Deniz, Giannakos, Michail, and Klemke, Roland
- Subjects
- *
MULTIMODAL user interfaces , *SELF-regulated learning , *NATURAL language processing , *GROUP problem solving , *EDUCATIONAL psychology , *MACHINE learning - Abstract
This article provides an overview of the advancements in immersive technologies, such as mixed reality, virtual reality, and augmented reality, and their potential for enhancing skills development and education. It discusses the concept of multimodal and immersive learning systems, which combine different modes of interaction with artificial intelligence to create personalized learning experiences. The article also introduces a special section that showcases research on multimodal and immersive systems in various domains, including collaboration, mathematics, reading comprehension, art appreciation, and skills development. The papers in the special section highlight the diversity of applications and the impact of these technologies on different skills and learning scenarios, while also identifying gaps and challenges for future research. Overall, the findings contribute to our understanding of the potential for enhanced learning experiences and personalized skill development through multimodal and immersive technologies. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
34. Towards automated transcribing and coding of embodied teamwork communication through multimodal learning analytics.
- Author
-
Zhao, Linxuan, Gašević, Dragan, Swiecki, Zachari, Li, Yuheng, Lin, Jionghao, Sha, Lele, Yan, Lixiang, Alfredo, Riordan, Li, Xinyu, and Martinez‐Maldonado, Roberto
- Subjects
- *
RAPID response teams , *NATURAL language processing , *AUTOMATIC speech recognition , *LEARNING , *CLASSROOM environment - Abstract
Effective collaboration and teamwork skills are critical in high‐risk sectors, as deficiencies in these areas can result in injuries and risk of death. To foster the growth of these vital skills, immersive learning spaces have been created to simulate real‐world scenarios, enabling students to safely improve their teamwork abilities. In such learning environments, multiple dialogue segments can occur concurrently as students independently organise themselves to tackle tasks in parallel across diverse spatial locations. This complex situation creates challenges for educators in assessing teamwork and for students in reflecting on their performance, especially considering the importance of effective communication in embodied teamwork. To address this, we propose an automated approach for generating teamwork analytics based on spatial and speech data. We illustrate this approach within a dynamic, immersive healthcare learning environment centred on embodied teamwork. Moreover, we evaluated whether the automated approach can produce transcriptions and epistemic networks of spatially distributed dialogue segments with a quality comparable to those generated manually for research objectives. 
This paper makes two key contributions: (1) it proposes an approach that integrates automated speech recognition and natural language processing techniques to automate the transcription and coding of team communication and generate analytics; and (2) it provides analyses of the errors in outputs generated by those techniques, offering insights for researchers and practitioners involved in the design of similar systems.

Practitioner notes

What is currently known about this topic:
- Immersive learning environments simulate real‐world situations, helping students improve their teamwork skills.
- In these settings, students can have multiple simultaneous conversations while working together on tasks at different physical locations.
- The dynamic nature of these interactions makes it hard for teachers to assess teamwork and communication and for students to reflect on their performance.

What this paper adds:
- We propose a method that employs multimodal learning analytics for automatically generating teamwork‐related insights into the content of student conversations.
- This data processing method allows for automatically transcribing and coding spatially distributed dialogue segments generated from students working in teams in an immersive learning environment and enables downstream analysis.
- This approach uses spatial analytics, natural language processing and automated speech recognition techniques.

Implications for practitioners:
- Automated coding of dialogue segments among team members can help create analytical tools to assist in evaluating and reflecting on teamwork.
- By analysing spatial and speech data, it is possible to apply learning analytics advancements to support teaching and learning in fast‐paced physical learning spaces where students can freely engage with one another. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
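The pipeline above codes transcribed utterances into teamwork communication categories. A minimal rule-based stand-in for that coding step; the category names and keyword rules below are invented for illustration (the paper's coding scheme is far richer and not keyword-based):

```python
# Keyword-rule sketch of utterance coding. Each hypothetical category
# is matched by a case-insensitive regular expression; an utterance
# can receive several codes.

import re

CODES = {
    "closed_loop": re.compile(r"\b(copy|confirmed|got it)\b", re.I),
    "task_allocation": re.compile(r"\b(you take|i'll handle|can you)\b", re.I),
}

def code_utterance(text):
    """Return every code whose pattern matches the utterance."""
    return [code for code, pat in CODES.items() if pat.search(text)]

print(code_utterance("Got it, I'll handle the airway"))
```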
35. The Surgical Clerkship in the COVID Era: A Natural Language Processing and Thematic Analysis.
- Author
-
Howell, Thomas Clark, Ladowski, Joseph M., Nash, Amanda, Rhodin, Kristen E., Tracy, Elisabeth T., Migaly, John, Bloom, Diane, and Vatsaas, Cory J.
- Subjects
- *
NATURAL language processing , *COVID-19 pandemic , *THEMATIC analysis , *MEDICAL students , *SURGICAL education , *MEDICAL education , *FORMATIVE tests - Abstract
Responses to COVID-19 within medical education prompted significant changes to the surgical clerkship. We analyzed the changes in medical student end-of-course feedback before and after the COVID-19 outbreak. Postclerkship surveys from 2017 to 2022 were analyzed, including both Likert scale data and free text, excluding the COVID outbreak year 2019-2020. Likert scale questions were compared between pre-COVID (2017-2019) and COVID-era cohorts (2020-2022) with the Mann–Whitney U-test. Free-text comments were analyzed using both thematic analysis and natural language processing, including sentiment, word and phrase frequency, and topic modeling. Of the 483 medical students surveyed from 2017 to 2022, 297 responded (61% response rate) to the included end-of-clerkship surveys. Most medical students rated the clerkship above average or excellent, with no significant difference between the pre-COVID and COVID-era cohorts (70.4% versus 64.8%, P = 0.35). Perception of grading expectations did significantly differ: 51% of pre-COVID students reported clerkship grading standards were almost always clear compared to 27.5% of COVID-era students (P = 0.01). Pre-COVID cohorts more frequently mentioned learning and feedback, while COVID-era cohorts more frequently mentioned case, attending, and expectation. Natural language processing topic modeling and formal thematic analysis identified similar themes: team, time, autonomy, and expectations. COVID-19 presented many challenges to undergraduate medical education. Despite many changes, there was no significant difference in clerkship satisfaction ratings. Unexpectedly, the greater freedom and autonomy of asynchronous lectures and choice of cases became a highlight of the new curriculum. Future research should investigate whether there are similar associations nationally with a multi-institutional study. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
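One of the NLP steps named above, word and phrase frequency over free-text comments, can be sketched with the standard library. The comments and stopword list below are invented examples, not the study's survey data:

```python
# Word-frequency sketch over free-text feedback using collections.Counter.

from collections import Counter
import re

comments = [
    "Great team and lots of autonomy in the OR",
    "Grading expectations were not clear",
    "Loved the team, wanted clearer expectations",
]

STOPWORDS = {"the", "and", "of", "in", "a", "were", "not"}

def word_counts(texts):
    """Tokenize, lowercase, drop stopwords, and count words."""
    words = []
    for t in texts:
        words += [w for w in re.findall(r"[a-z']+", t.lower())
                  if w not in STOPWORDS]
    return Counter(words)

counts = word_counts(comments)
print(counts.most_common(3))
```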
36. Assessing ChatGPT's Responses to Otolaryngology Patient Questions.
- Author
-
Carnino, Jonathan M., Pellegrini, William R., Willis, Megan, Cohen, Michael B., Paz-Lansberg, Marianella, Davis, Elizabeth M., Grillone, Gregory A., and Levi, Jessica R.
- Subjects
- *
CROSS-sectional method , *EMPATHY , *MEDICAL specialties & specialists , *PATIENT safety , *DATA analysis , *ARTIFICIAL intelligence , *NATURAL language processing , *STATISTICS , *USER interfaces , *EVALUATION - Abstract
Objective: This study aims to evaluate ChatGPT's performance in addressing real-world otolaryngology patient questions, focusing on accuracy, comprehensiveness, and patient safety, to assess its suitability for integration into healthcare. Methods: A cross-sectional study was conducted using patient questions from the public online forum Reddit's r/AskDocs, where medical advice is sought from healthcare professionals. Patient questions were input into ChatGPT (GPT-3.5), and responses were reviewed by 5 board-certified otolaryngologists. The evaluation criteria included difficulty, accuracy, comprehensiveness, and bedside manner/empathy. Statistical analysis explored the relationship between patient question characteristics and ChatGPT response scores. Potentially dangerous responses were also identified. Results: Patient questions averaged 224.93 words, while ChatGPT responses were longer at 414.93 words. The accuracy scores for ChatGPT responses were 3.76/5, comprehensiveness scores were 3.59/5, and bedside manner/empathy scores were 4.28/5. Longer patient questions did not correlate with higher response ratings. However, longer ChatGPT responses scored higher in bedside manner/empathy. Higher question difficulty correlated with lower comprehensiveness. Five responses were flagged as potentially dangerous. Conclusion: While ChatGPT exhibits promise in addressing otolaryngology patient questions, this study demonstrates its limitations, particularly in accuracy and comprehensiveness. The identification of potentially dangerous responses underscores the need for a cautious approach to AI in medical advice. Responsible integration of AI into healthcare necessitates thorough assessments of model performance and ethical considerations for patient safety. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Social media and social impact assessment: Evolving methods in a shifting context.
- Author
-
Sherren, Kate, Chen, Yan, Mohammadi, Mehrnoosh, Zhao, Qiqi, Gone, Keshava Pallavi, Rahman, HM Tuihedur, and Smit, Michael
- Subjects
- *
SOCIAL impact assessment , *SOCIAL media , *NATURAL language processing , *WETLAND restoration , *DAM retirement - Abstract
Among many by-products of Web 2.0 come the wide range of potential image and text datasets within social media and content sharing platforms that speak of how people live, what they do, and what they care about. These datasets are imperfect and biased in many ways, but those flaws make them complementary to data derived from conventional social science methods and thus potentially useful for triangulation in complex decision-making contexts. Yet the online environment is highly mutable, and so the datasets are less reliable than censuses or other standard data types leveraged in social impact assessment. Over the past decade, we have innovated numerous methods for deploying Instagram datasets in investigating management or development alternatives. This article synthesizes work from three Canadian decision contexts – hydroelectric dam construction or removal; dyke realignment or wetland restoration; and integrating renewable energy into vineyard landscapes – to illustrate some of the methods we have applied to social impact assessment questions using Instagram that may be transferrable to other social media platforms and contexts: thematic (manual coding, machine vision, natural language processing/sentiment analysis, statistical analysis), spatial (hotspot mapping, cultural ecosystem modeling), and visual (word clouds, saliency mapping, collage). We conclude with a set of cautions and next steps for the domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Extractive text summarization for biomedical transcripts using deep dense LSTM‐CNN framework.
- Author
-
Bedi, Parminder Pal Singh, Bala, Manju, and Sharma, Kapil
- Subjects
- *
AUTOMATIC summarization , *TEXT summarization , *MACHINE learning , *CONVOLUTIONAL neural networks , *DEEP learning , *NATURAL language processing , *NEUROLINGUISTICS - Abstract
The most recent and precise biological and healthcare knowledge is critical during outbreaks such as COVID. In today's small world, everyone needs timely and appropriate medical information to prevent contagious diseases. Extracting important information from medical conversations and disseminating it to patients and doctors may help address doctor tiredness and patient amnesia. Problem: Automatic text summarization is essential for gaining knowledge of any topic in an efficient and productive manner. The material included in health records is vital to our understanding of an illness and its manifestations. Creating comprehensive and standardized content is becoming an unavoidable and crucial problem in the medical process as a result of the massive amounts of fragmented data created in many sectors. Approach: The purpose of this study is to employ NLP‐based deep learning algorithms for text summarization that perform well on linguistic text summarization data, and then modify/adapt these for biomedical domain‐specific text summarization. This paper provides an approach developed in‐house for condensing ill‐punctuated or unpunctuated discussion transcripts into more intelligible summaries, which combines topic modelling and phrase selection with punctuation restoration. For autonomous synthesis of medical reports from biomedical transcripts, this research proposes using an end‐to‐end summarization technique, a Deep Dense Long Short Term Memory Network (LSTM) followed by a Convolutional Neural Network (CNN). Results: Extensive testing, examination, and comparison have demonstrated that this summarizer works well for medical transcript summarization. The suggested approach achieved an average ROUGE score of 93.5% using a single document summary. Furthermore, comparing the new techniques to previous ones shows the utility and accuracy of the novel strategies. 
The results reveal that models trained on ordinary language provide comparable results on a biomedical testing set, with one model outperforming the linguistic test set. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
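The summarizer above is scored with ROUGE. A minimal sketch of ROUGE-1 recall (unigram overlap against a reference summary), on invented sentences rather than the paper's transcripts:

```python
# ROUGE-1 recall sketch: clipped unigram overlap between candidate and
# reference, divided by the reference length. Sentences are invented.

from collections import Counter

def rouge1_recall(candidate, reference):
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], n) for w, n in ref.items())
    return overlap / sum(ref.values())

ref = "patient reports chest pain and shortness of breath"
cand = "patient has chest pain and breath shortness"
print(round(rouge1_recall(cand, ref), 2))
```

Published ROUGE figures are usually computed with the reference toolkit (stemming, multiple references, F-measures); this shows only the core overlap idea.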
39. Code‐mixed Hindi‐English text correction using fuzzy graph and word embedding.
- Author
-
Jain, Minni, Jindal, Rajni, and Jain, Amita
- Subjects
- *
NATURAL language processing , *SOCIAL media , *FUZZY graphs , *SENTIMENT analysis , *SPELLING errors - Abstract
Interaction via social media involves frequent code‐mixed text, spelling errors and noisy elements, which creates a bottleneck in the performance of natural language processing applications. This proposed work is the first approach for code‐mixed Hindi‐English social media text that comprises language identification, detection and correction of non‐word (Out of Vocabulary) errors as well as real‐word errors occurring simultaneously. Each identified language (Devanagari Hindi, Roman Hindi, and English) has its own complexities and challenges. Errors are detected individually for each language and a suggestive list of the erroneous words is created. After this, a fuzzy graph between different words of the suggestive lists is generated using various semantic relations in Hindi WordNet. Word embeddings and fuzzy graph‐based centrality measures are used to find the correct word. Several experiments are performed on different social media datasets taken from Instagram, Twitter, YouTube comments, Blogs, and WhatsApp. The experimental results demonstrate that the proposed system corrects out‐of‐vocabulary words as well as real‐word errors with a maximum recall of 0.90 and 0.67, respectively, for Dev_Hindi and 0.87 and 0.66, respectively, for Rom_Hindi. The proposed method is also applied to state‐of‐the‐art sentiment analysis approaches, where the F1‐score has been visibly improved. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. HOTSPOT: An ad hoc teamwork platform for mixed human-robot teams.
- Author
-
Ribeiro, João G., Henriques, Luis Müller, Colcher, Sérgio, Duarte, Julio Cesar, Melo, Francisco S., Milidiú, Ruy Luiz, and Sardinha, Alberto
- Subjects
- *
MOBILE robots , *NATURAL language processing , *AD hoc computer networks , *MULTIAGENT systems , *GROUP work in research , *TEAMS - Abstract
Ad hoc teamwork is a research topic in multi-agent systems whereby an agent (the "ad hoc agent") must successfully collaborate with a set of unknown agents (the "teammates") without any prior coordination or communication protocol. However, research in ad hoc teamwork has focused predominantly on agent-only teams rather than agent-human teams, which we believe is an exciting research avenue with enormous application potential for human-robot teams. This paper taps into this potential by proposing HOTSPOT, the first framework for ad hoc teamwork in human-robot teams. Our framework comprises two main modules, addressing the two key challenges in the interaction between a robot acting as the ad hoc agent and human teammates. First, a decision-theoretic module is responsible for all task-related decision-making (task identification, teammate identification, and planning). Second, a communication module uses natural language processing to parse all communication between the robot and the human. To evaluate our framework, we use a task where a mobile robot and a human cooperatively collect objects in an open space, illustrating the main features of our framework in a real-world task. [ABSTRACT FROM AUTHOR]
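The communication module's job (mapping a human utterance to a task-level action) can be caricatured with a tiny keyword parser; the verbs, action labels, and stop-word list below are all invented, and the real module uses full NLP parsing rather than keyword lookup:

```python
# Toy stand-in for a robot command parser (verbs and labels are hypothetical).
ACTIONS = {"pick": "PICK_UP", "bring": "DELIVER", "go": "MOVE_TO"}

def parse_command(utterance: str):
    """Return an (action, object) pair for the first known verb, else None."""
    words = utterance.lower().strip(".!?").split()
    for i, w in enumerate(words):
        if w in ACTIONS:
            # Keep the remaining words as the object, dropping filler tokens.
            target = [t for t in words[i + 1:] if t not in {"the", "a", "an", "up"}]
            return ACTIONS[w], " ".join(target)
    return None

print(parse_command("Pick up the red box!"))  # → ('PICK_UP', 'red box')
```

A decision-theoretic planner could then consume the returned action symbol, which is the division of labour the two HOTSPOT modules implement at much larger scale.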
- Published
- 2024
- Full Text
- View/download PDF
41. Identifying stigmatizing language in clinical documentation: A scoping review of emerging literature.
- Author
-
Barcelona, Veronica, Scharp, Danielle, Idnay, Betina R., Moen, Hans, Cato, Kenrick, and Topaz, Maxim
- Subjects
- *
CINAHL database , *LITERATURE reviews , *HEALTH services accessibility , *NATURAL language processing - Abstract
Background: Racism and implicit bias underlie disparities in health care access, treatment, and outcomes. An emerging area of study in examining health disparities is the use of stigmatizing language in the electronic health record (EHR). Objectives: We sought to summarize the existing literature related to stigmatizing language documented in the EHR. To this end, we conducted a scoping review to identify, describe, and evaluate the current body of literature related to stigmatizing language and clinician notes. Methods: We searched PubMed, Cumulative Index of Nursing and Allied Health Literature (CINAHL), and Embase databases in May 2022, and also conducted a hand search of IEEE to identify studies investigating stigmatizing language in clinical documentation. We included all studies published through April 2022. The results for each search were uploaded into EndNote X9 software, de-duplicated using the Bramer method, and then exported to Covidence software for title and abstract screening. Results: Studies (N = 9) used cross-sectional (n = 3), qualitative (n = 3), mixed methods (n = 2), and retrospective cohort (n = 1) designs. Stigmatizing language was defined via content analysis of clinical documentation (n = 4), literature review (n = 2), interviews with clinicians (n = 3) and patients (n = 1), expert panel consultation, and task force guidelines (n = 1). Natural language processing was used in four studies to identify and extract stigmatizing words from clinical notes. All of the studies reviewed concluded that negative clinician attitudes and the use of stigmatizing language in documentation could negatively impact patient perception of care or health outcomes. Discussion: The current literature indicates that NLP is an emerging approach to identifying stigmatizing language documented in the EHR. 
NLP-based solutions can be developed and integrated into routine documentation systems to screen for stigmatizing language and alert clinicians or their supervisors. Potential interventions resulting from this research could generate awareness about how implicit biases affect communication patterns and work to achieve equitable health care for diverse populations. [ABSTRACT FROM AUTHOR]
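A first step toward the screening systems envisioned here is a simple lexicon match over note text; the terms below are invented examples for illustration only, and the NLP approaches in the reviewed studies go well beyond keyword matching (context, negation, and entity extraction):

```python
# Illustrative lexicon screen; real stigmatizing-language lexicons are
# curated by clinicians and researchers, these terms are example stand-ins.
STIGMA_LEXICON = {"non-compliant", "drug-seeking", "frequent flyer"}

def flag_stigmatizing(note: str) -> list[str]:
    """Return the lexicon terms that appear in a clinical note."""
    lowered = note.lower()
    return sorted(term for term in STIGMA_LEXICON if term in lowered)

note = "Patient is non-compliant with medication; reports pain."
print(flag_stigmatizing(note))  # → ['non-compliant']
```

In a deployed system, such flags would feed the clinician-alerting workflow the discussion describes rather than being surfaced raw.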
- Published
- 2024
- Full Text
- View/download PDF
42. Meta-2OM: A multi-classifier meta-model for the accurate prediction of RNA 2′-O-methylation sites in human RNA.
- Author
-
Harun-Or-Roshid, Md., Pham, Nhat Truong, Manavalan, Balachandran, and Kurata, Hiroyuki
- Subjects
- *
TRANSFER RNA , *INTERNET servers , *RNA modification & restriction , *MACHINE learning , *RNA , *NATURAL language processing , *SMALL nuclear RNA , *NATURAL immunity - Abstract
2′-O-methylation (2-OM or Nm) is a widespread RNA modification observed in various RNA types like tRNA, mRNA, rRNA, miRNA, piRNA, and snRNA, which plays a crucial role in several biological functional mechanisms and innate immunity. To comprehend its modification mechanisms and potential epigenetic regulation, it is necessary to accurately identify 2-OM sites. However, biological experiments can be tedious, time-consuming, and expensive. Furthermore, currently available computational methods face challenges due to inadequate datasets and limited classification capabilities. To address these challenges, we proposed Meta-2OM, a cutting-edge predictor that can accurately identify 2-OM sites in human RNA. In brief, we applied a meta-learning approach that considered eight conventional machine learning algorithms, including tree-based classifiers and decision boundary-based classifiers, and eighteen different feature encoding algorithms that cover physicochemical, compositional, position-specific and natural language processing information. The predicted probabilities of 2-OM sites from the baseline models are then combined and trained using logistic regression to generate the final prediction. Consequently, Meta-2OM achieved excellent performance in both 5-fold cross-validation training and independent testing, outperforming all existing state-of-the-art methods. Specifically, on the independent test set, Meta-2OM achieved an overall accuracy of 0.870, sensitivity of 0.836, specificity of 0.904, and Matthews correlation coefficient of 0.743. To facilitate its use, a user-friendly web server and standalone program have been developed and are freely available at http://kurata35.bio.kyutech.ac.jp/Meta-2OM and https://github.com/kuratahiroyuki/Meta-2OM. [ABSTRACT FROM AUTHOR]
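The stacking step described here (baseline models emit probabilities, and a logistic-regression meta-learner combines them) can be sketched with a pure-Python toy; the two hypothetical baseline models, the six data points, and the plain gradient-descent fit are all invented for illustration, not the paper's setup:

```python
import math

# Each row: predicted 2-OM probabilities from two hypothetical baseline models.
base_probs = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.2, 0.3), (0.3, 0.1), (0.1, 0.2)]
labels = [1, 1, 1, 0, 0, 0]  # 1 = true 2-OM site, 0 = negative

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit the logistic-regression meta-learner with plain stochastic gradient descent.
w = [0.0, 0.0]
b = 0.0
lr = 0.5
for _ in range(2000):
    for (p1, p2), y in zip(base_probs, labels):
        pred = sigmoid(w[0] * p1 + w[1] * p2 + b)
        err = pred - y
        w[0] -= lr * err * p1
        w[1] -= lr * err * p2
        b -= lr * err

# Final stacked predictions: meta-learner applied to the baseline probabilities.
meta = [sigmoid(w[0] * p1 + w[1] * p2 + b) for p1, p2 in base_probs]
print([round(p, 2) for p in meta])
```

Meta-2OM does this over eight algorithms crossed with eighteen encodings rather than two toy models, but the combination principle is the same.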
- Published
- 2024
- Full Text
- View/download PDF
43. iProL: identifying DNA promoters from sequence information based on Longformer pre-trained model.
- Author
-
Peng, Binchao, Sun, Guicong, and Fan, Yongxian
- Subjects
- *
NATURAL language processing , *MOLECULAR biology , *MOLECULAR genetics , *NUCLEOTIDE sequence , *ESCHERICHIA coli - Abstract
Promoters are essential elements of DNA sequences, usually located in the region immediately surrounding gene transcription start sites, and play a critical role in the regulation of gene transcription. Their importance in molecular biology and genetics has attracted wide research interest, and seeking a computational method to efficiently identify promoters has become a consensus goal. Still, existing methods recognize positive and negative samples unevenly, and their performance can be improved further. We conducted research on E. coli promoters and proposed a more advanced prediction model, iProL, based on the Longformer pre-trained model from the field of natural language processing. iProL does not rely on prior biological knowledge but simply uses promoter DNA sequences as plain text to identify promoters. It also combines one-dimensional convolutional neural networks and bidirectional long short-term memory to extract both local and global features. Experimental results show that iProL has a more balanced and superior performance than currently published methods. Additionally, we constructed a novel independent test set following the previous specification and compared iProL with three existing methods on this independent test set. [ABSTRACT FROM AUTHOR]
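Treating a DNA sequence "as plain text", as this abstract puts it, starts with tokenization. iProL feeds sequences to a Longformer tokenizer, but a common and simpler way to frame DNA as text, shown here purely as an illustration, is to slide a k-mer window so that overlapping k-mers act as words:

```python
def kmer_tokens(seq: str, k: int = 3) -> list[str]:
    """Treat a DNA sequence as plain text by sliding a k-mer 'word' window."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

print(kmer_tokens("ATGCGT"))  # → ['ATG', 'TGC', 'GCG', 'CGT']
```

Once tokenized this way, a sequence can flow through the same embedding, convolutional (local feature), and recurrent (global feature) layers used for ordinary sentences.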
- Published
- 2024
- Full Text
- View/download PDF
44. BAOS-CNN: A novel deep neuroevolution algorithm for multispecies seagrass detection.
- Author
-
Noman, Md Kislu, Shamsul Islam, Syed Mohammed, Jafar Jalali, Seyed Mohammad, Abu-Khalaf, Jumana, and Lavery, Paul
- Subjects
- *
DEEP learning , *NATURAL language processing , *OPTIMIZATION algorithms , *ARCHITECTURAL engineering , *METAHEURISTIC algorithms , *COMPUTER vision , *SEAGRASSES - Abstract
Deep learning, a subset of machine learning that utilizes neural networks, has seen significant advancements in recent years. These advancements have led to breakthroughs in a wide range of fields, from natural language processing to computer vision, and have the potential to revolutionize many industries. Deep learning models have also demonstrated exceptional performance in the identification and mapping of seagrass images. However, these models, particularly the popular Convolutional Neural Networks (CNNs), require architectural engineering and hyperparameter tuning. This paper proposes a Deep Neuroevolutionary (DNE) model that automates the architectural engineering and hyperparameter tuning of CNN models by developing and using a novel metaheuristic algorithm named 'Boosted Atomic Orbital Search (BAOS)'. The proposed BAOS is an improved version of the recently proposed Atomic Orbital Search (AOS) algorithm, which is based on the principles of the atomic model and quantum mechanics. The proposed algorithm leverages the Lévy flight technique to boost the performance of the AOS algorithm. The proposed DNE algorithm (BAOS-CNN) is trained, evaluated and compared with six popular optimisation algorithms on a patch-based multi-species seagrass dataset. The proposed BAOS-CNN model achieves the highest overall accuracy (97.48%) among the seven evolutionary-based CNN models. It also achieves state-of-the-art overall accuracies of 92.30% and 93.5% on the publicly available four-class and five-class versions of the 'DeepSeagrass' dataset, respectively. This multi-species seagrass dataset is available at: https://ro.ecu.edu.au/datasets/141/. [ABSTRACT FROM AUTHOR]
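The Lévy flight technique mentioned here is a random walk with heavy-tailed step lengths, so a search occasionally makes large jumps out of local optima. A common way to draw such steps is Mantegna's algorithm, sketched below; this is a generic formulation and the paper's exact BAOS variant may differ:

```python
import math
import random

def levy_step(beta: float = 1.5) -> float:
    """One Lévy-flight step length via Mantegna's algorithm (common formulation)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0.0, sigma)  # heavy-tailed numerator
    v = random.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

random.seed(0)
steps = [levy_step() for _ in range(1000)]
print(round(sum(abs(s) for s in steps) / len(steps), 2))  # mean |step|
```

In a metaheuristic like BAOS, each candidate solution would be perturbed by `step * (position - best_position)` or similar, mixing many small moves with rare long jumps.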
- Published
- 2024
- Full Text
- View/download PDF
45. Effects of working memory and task type on syntactic complexity in EFL learners’ writing.
- Author
-
Jiang, Lei, Abbuhl, Rebekha, and Fu, Yv
- Subjects
- *
NATURAL language processing , *MEMORY span , *SHORT-term memory , *ENGLISH as a foreign language , *CHINESE-speaking students , *LISTENING comprehension , *ADULT students - Abstract
This study investigated the predictive power of working memory and task type for syntactic complexity in EFL adult learners’ academic writing. One hundred forty-eight Chinese adult students were recruited as participants. Their working memory was assessed with an operation span task, a set of digit span tasks, and a symmetry span task. The syntactic complexity of their written products from two different TOEFL iBT writing tasks, an integrated writing task and an independent writing task, was measured using a natural language processing tool. Results showed a significant positive association between operation span and coordination in the students’ written products. In addition, a significant difference was found between the integrated task and the independent task with respect to phrasal complexity, with the integrated task eliciting more complex nominals per clause than the independent task. No significant effects were identified for other components of working memory or other measures of syntactic complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. End-to-end pseudonymization of fine-tuned clinical BERT models: Privacy preservation with maintained data utility.
- Author
-
Vakili, Thomas, Henriksson, Aron, and Dalianis, Hercules
- Subjects
- *
LANGUAGE models , *DATA privacy , *PRIVACY , *NATURAL language processing - Abstract
Many state-of-the-art results in natural language processing (NLP) rely on large pre-trained language models (PLMs). These models contain large numbers of parameters that are tuned using vast amounts of training data. These factors cause the models to memorize parts of their training data, making them vulnerable to various privacy attacks. This is cause for concern, especially when these models are applied in the clinical domain, where data are very sensitive. Training data pseudonymization is a privacy-preserving technique that aims to mitigate these problems. This technique automatically identifies and replaces sensitive entities with realistic but non-sensitive surrogates. Pseudonymization has yielded promising results in previous studies. However, no previous study has applied pseudonymization to both the pre-training data of PLMs and the fine-tuning data used to solve clinical NLP tasks. This study evaluates the effects on predictive performance of end-to-end pseudonymization of Swedish clinical BERT models fine-tuned for five clinical NLP tasks. A large number of statistical tests are performed, revealing minimal harm to performance when using pseudonymized fine-tuning data. The results also show no deterioration from end-to-end pseudonymization of pre-training and fine-tuning data. These results demonstrate that pseudonymizing training data to reduce privacy risks can be done without harming data utility for training PLMs. [ABSTRACT FROM AUTHOR]
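The core pseudonymization operation (swap each detected sensitive entity for a realistic surrogate) can be sketched in a few lines; the note, entities, and surrogate table are invented, and a real pipeline would detect the entity spans automatically with NER rather than receive them pre-labelled:

```python
# Surrogates per entity label: realistic but fake replacements (invented here).
SURROGATES = {"PERSON": "Anna Svensson", "DATE": "2001-01-01"}

def pseudonymize(text: str, entities: list[tuple[str, str]]) -> str:
    """Replace each (span, label) entity with the surrogate for its label."""
    for span, label in entities:
        text = text.replace(span, SURROGATES[label])
    return text

note = "Patient Erik Larsson admitted on 2023-05-14 with chest pain."
ents = [("Erik Larsson", "PERSON"), ("2023-05-14", "DATE")]
print(pseudonymize(note, ents))
# → Patient Anna Svensson admitted on 2001-01-01 with chest pain.
```

The study's finding is that training both the pre-training and fine-tuning stages on text transformed this way costs essentially no downstream performance.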
- Published
- 2024
- Full Text
- View/download PDF
47. Digital cloning of online social networks for language-sensitive agent-based modeling of misinformation spread.
- Author
-
Puri, Prateek, Hassler, Gabriel, Katragadda, Sai, and Shenk, Anton
- Subjects
- *
ONLINE social networks , *GENERATIVE artificial intelligence , *VIRTUAL communities , *NATURAL language processing , *MISINFORMATION , *SYNTHETIC fuels , *SOCIAL networks - Abstract
We develop a simulation framework for studying misinformation spread within online social networks that blends agent-based modeling and natural language processing techniques. While many other agent-based simulations exist in this space, questions over their fidelity and generalization to existing networks in part hinder their ability to drive policy-relevant decision making. To partially address these concerns, we create a 'digital clone' of a known misinformation sharing network by downloading social media histories for over ten thousand of its users. We parse these histories to both extract the structure of the network and model the nuanced ways in which information is shared and spread among its members. Unlike many other agent-based methods in this space, information sharing between users in our framework is sensitive to topic of discussion, user preferences, and online community dynamics. To evaluate the fidelity of our method, we seed our cloned network with a set of posts recorded in the base network and compare propagation dynamics between the two, observing reasonable agreement across the twin networks over a variety of metrics. Lastly, we explore how the cloned network may serve as a flexible, low-cost testbed for misinformation countermeasure evaluation and red teaming analysis. We hope the tools explored here augment existing efforts in the space and unlock new opportunities for misinformation countermeasure evaluation, a field that may become increasingly important to consider with the anticipated rise of misinformation campaigns fueled by generative artificial intelligence. [ABSTRACT FROM AUTHOR]
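The propagation dynamics this framework simulates can be caricatured by an independent-cascade model over a follower graph; the graph and share probability below are invented, and the paper's agents are far richer (sensitive to topic, user preferences, and community dynamics rather than a single fixed probability):

```python
import random

# Toy follower graph: each user's list of followers (invented).
followers = {"u1": ["u2", "u3"], "u2": ["u4"], "u3": ["u4"], "u4": []}

def spread(seed, p, rng):
    """Seed a post at `seed`; each exposed follower reshares with probability p."""
    shared, frontier = {seed}, [seed]
    while frontier:
        user = frontier.pop()
        for f in followers[user]:
            if f not in shared and rng.random() < p:
                shared.add(f)
                frontier.append(f)
    return shared

print(sorted(spread("u1", 1.0, random.Random(42))))  # → ['u1', 'u2', 'u3', 'u4']
```

Running many seeded cascades and comparing reach and timing against the recorded posts is, in spirit, the fidelity check the authors perform between the clone and the base network.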
- Published
- 2024
- Full Text
- View/download PDF
48. A BERT-based pretraining model for extracting molecular structural information from a SMILES sequence.
- Author
-
Zheng, Xiaofan and Tomiura, Yoichi
- Subjects
- *
ARTIFICIAL neural networks , *LANGUAGE models , *SMILING , *MOLECULAR structure , *MACHINE learning , *NATURAL language processing - Abstract
Obtaining desired molecular properties, among the many properties and their combinations, through theory or experiment is a costly process. Using machine learning to analyze molecular structure features and predict molecular properties is a potentially efficient alternative for accelerating such prediction. In this study, we analyze molecular properties through the molecular structure from the perspective of machine learning. We use SMILES sequences as inputs to an artificial neural network for extracting molecular structural features and predicting molecular properties. A SMILES sequence comprises symbols representing molecular structures. To address the problem that a SMILES sequence differs from actual molecular structural data, we propose a pretraining model for SMILES sequences based on the BERT model, which is widely used in natural language processing, such that the model learns to extract the molecular structural information contained in the SMILES sequence. In an experiment, we first pretrain the proposed model with 100,000 SMILES sequences and then use the pretrained model to predict molecular properties on 22 data sets and the odor characteristics of molecules (98 types of odor descriptor). The experimental results show that our proposed pretraining model effectively improves the performance of molecular property prediction. Scientific contribution: The 2-encoder pretraining is proposed based on two observations: symbols in a SMILES sequence depend less on their context than words in a natural-language sentence, and one compound corresponds to multiple SMILES sequences. The model pretrained with the 2-encoder shows higher robustness in molecular property prediction tasks compared to BERT, which is adept at natural language. [ABSTRACT FROM AUTHOR]
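Before any BERT-style pretraining, a SMILES string must be split into the symbols the abstract mentions (atoms, bonds, ring-closure digits). A regex tokenizer along the following lines is a common approach; this pattern is an illustrative assumption covering frequent organic-subset symbols, not the paper's actual tokenizer:

```python
import re

# Multi-character symbols (Cl, Br, bracket atoms) must match before single ones.
SMILES_TOKEN = re.compile(r"Cl|Br|\[[^\]]+\]|[BCNOSPFI]|[cnos]|[=#()\d@+\-\\/]")

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into atom/bond/ring tokens for model input."""
    return SMILES_TOKEN.findall(smiles)

print(tokenize_smiles("CC(=O)Oc1ccccc1"))  # a small ester: every symbol one token
```

These tokens would then be masked and predicted during pretraining, exactly as words are in standard masked language modelling.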
- Published
- 2024
- Full Text
- View/download PDF
49. Innovation trends and evolutionary paths of green fuel technologies in maritime field: A global patent review.
- Author
-
Sun, Minghan, Tong, Tong, Jiang, Man, and Zhu, Jewel X.
- Subjects
- *
GREEN fuels , *HYDROGEN as fuel , *FUEL cells , *HYBRID power systems , *GREEN technology , *NATURAL language processing , *ALTERNATIVE fuels - Abstract
As global environmental issues become increasingly prominent, the maritime industry faces an urgent imperative to curtail carbon emissions and mitigate environmental impact. Green marine alternative fuels actively address these challenges. Global patent data are collected, and four sub-technologies of green fuel technologies in the maritime field are extracted based on Natural Language Processing. To evaluate the innovation trends and evolutionary paths of these technologies, Main Path Analysis is employed to identify the evolution path of each technology and the evolution of major innovative entities, while Social Network Analysis is used to present the results visually. These four sub-technologies (Hydrogen & Fuel Cell, Methanol & Ethanol, Ammonia, and LNG & LPG) have occupied the mainstream of maritime fuel technology over the past five years, with 34.6% of patents, 38.3% of patent citations, and 93.9% of technological influence, individually accounting for 27.4%, 20.5%, 16.7% and 44.3% of these patents, respectively. Ammonia and Hydrogen & Fuel Cell are expanding rapidly, while the development of LNG & LPG and Methanol & Ethanol is relatively mature. Each of these technologies showcases distinct developmental paradigms. We elaborate on the developmental paradigms and future research priorities of each technology in the conclusion. In addition, new technological opportunities are also created through the cross-fertilization of technologies. In particular, integrated systems for hydrogen production, storage, and combustion on LNG ships, and maritime hybrid power systems driven by ammonia and hydrogen, have opened a new window for the green development of maritime fuels. • Four green fuel technologies account for 93.9% of maritime fuel technological influence in the past five years. • Hydrogen & Fuel Cell and Ammonia rapidly expand, while LNG & LPG and Methanol & Ethanol are relatively mature. 
• This study details the unique development paradigms and future research priorities for each technology. • Hydrogen systems on LNG ships and hybrid ammonia-hydrogen maritime power open new avenues for green fuel development. [ABSTRACT FROM AUTHOR]
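Main Path Analysis, which the abstract above applies to patent citations, typically weights each citation edge by a search path count (SPC): the number of source-to-sink paths in the citation DAG passing through it, so the "main path" follows the heaviest edges. A toy sketch with an invented five-patent citation graph:

```python
# Toy citation DAG (edges point from earlier patent to the one citing it; invented).
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("E", "F"), ("E", "G")]

def all_paths(graph, node, sinks):
    """Enumerate every path from `node` down to any sink (fine for tiny DAGs)."""
    if node in sinks:
        return [[node]]
    return [[node] + rest for nxt in graph[node] for rest in all_paths(graph, nxt, sinks)]

succ = {}
for u, v in edges:
    succ.setdefault(u, []).append(v)
sources = {u for u, _ in edges} - {v for _, v in edges}
sinks = {v for _, v in edges} - {u for u, _ in edges}

# SPC: count how many source-to-sink paths traverse each edge.
spc = {e: 0 for e in edges}
for s in sorted(sources):
    for path in all_paths(succ, s, sinks):
        for e in zip(path, path[1:]):
            spc[e] += 1

print(spc)  # edge ('C', 'E') carries the most paths → it lies on the main path
```

On real patent corpora the counts are computed by dynamic programming rather than path enumeration, but the edge-weighting idea is the same.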
- Published
- 2024
- Full Text
- View/download PDF
50. Using AI-Based Virtual Companions to Assist Adolescents with Autism in Recognizing and Addressing Cyberbullying.
- Author
-
Ferrer, Robinson, Ali, Kamran, and Hughes, Charles
- Subjects
- *
CYBERBULLYING , *ARTIFICIAL intelligence , *VIRTUAL communities , *SOCIAL media , *SUICIDE risk factors , *ADOLESCENT obesity , *TEENAGERS , *TEENAGE girls , *AUTISM - Abstract
Social media platforms and online gaming sites play a pervasive role in facilitating peer interaction and social development for adolescents, but they also pose potential threats to health and safety. It is crucial to tackle cyberbullying within these platforms to ensure the healthy social development of adolescents. Cyberbullying has been linked to adverse mental health outcomes among adolescents, including anxiety, depression, academic underperformance, and an increased risk of suicide. While cyberbullying is a concern for all adolescents, those with disabilities are particularly susceptible and face a higher risk of being targeted. Our research addresses these challenges by introducing a personalized online virtual companion guided by artificial intelligence (AI). The web-based virtual companion's interactions aim to assist adolescents in detecting cyberbullying. More specifically, an adolescent with ASD watches a cyberbullying scenario in a virtual environment, and the AI virtual companion then asks the adolescent whether he/she detected cyberbullying. To let the virtual companion know in real time whether the adolescent has learned to detect cyberbullying, we have implemented fast and lightweight cyberbullying detection models employing the T5-small and MobileBERT networks. Our experimental results show that we obtain results comparable to state-of-the-art methods despite having a compact architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF