19 results
Search Results
2. A Comparative Review of Sentimental Analysis Using Machine Learning and Deep Learning Approaches.
- Author
- Nagelli, Archana and Saleena, B.
- Subjects
- DEEP learning, MACHINE learning, NATURAL language processing, ONLINE social networks, SENTIMENT analysis, DATA mining
- Abstract
The sentiment data provides vital information about the feedback of the user's opinion, attitude and emotions. The business of product development and digital marketing teams entirely depends upon the outcome of these sentiments and they apply various Data Mining techniques, Machine Learning and Deep Learning approaches to analyse the depth of the dataset. The Sentiment Analysis provides the automatic data mining of reviews, comments, opinions and suggestions, received from various input methods, including text, audio notes, images and emoticons, through Natural Language Processing. The analysis assists in the classification of reviewer feedback in terms of positive, negative and neutral categories. In this study, the opinions shared by individuals over various social networking sites in the case of any big event, the release of any new product or show and political events were analysed. Machine Learning and Deep Learning techniques are discussed and used dominantly to illustrate the outcome of opinions and events. The accurate analysis of vast information shared by individuals free of cost and without any influence can provide vital information for organisations and management authorities. This review analyses various techniques in the field of Aspect-Based Sentiment Analysis along with their features and research scopes and thus, it helps researchers to focus on more precise works in the future. Among the machine learning algorithms, Random Forest performed much better as compared to other methods, and among the Deep Learning approaches, Multichannel CNN outperformed with the highest accuracy of 96.23%. The paper includes the comparative study of multiple Machine Learning and Deep Learning techniques for the evaluation of sentiment data and concludes with the challenges and scope of Sentiment Analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
3. Peaks, Slopes, Canyons and Plateaus: Identifying Technology Trends Throughout the Life Cycle.
- Author
- Efimenko, Irina V. and Khoroshevsky, Vladimir F.
- Subjects
- TECHNOLOGICAL innovations, TRENDS, TEXT files, NATURAL language processing, ALGORITHMS, SEMANTICS, DATA extraction, DATA mining
- Abstract
A novel domain-independent approach to technology trend monitoring is presented in the paper. It is based on the ontology of a technology trend, hype cycles methodology, and semantic indicators which provide evidence of a maturity level of a technology. This approach forms the basis for implementation of text-mining software tools. Algorithms behind these tools allow users to escape from getting too general or garbage results which make it impossible to identify promising technologies at early stages (early detection, weak signals). Besides, these algorithms provide high-quality results in extraction of complex multiword terms which correspond to technological concepts forming a trend. Methodology and software developed as a result of this study are applicable to various industries with minor adjustments and require no deep expert knowledge from a user. [ABSTRACT FROM AUTHOR]
- Published
- 2017
4. Natural Language Processing and Information Extraction in Biology.
- Author
- Jun-ichi Tsujii and Limsoon Wong
- Subjects
- NATURAL language processing, DATA mining, COMPUTATIONAL biology, BIOLOGICAL databases, BIOINFORMATICS
- Published
- 2000
5. Technical Review: Sarcasm Detection Algorithms.
- Author
- Yavanoglu, Uraz, Ibisoglu, Taha Yasin, and Wıcana, Setra Genyang
- Subjects
- DATA mining, PSYCHOLINGUISTICS, MACHINE learning
- Abstract
In this paper, we want to review one of the challenging problems for the opinion mining task, which is sarcasm detection. To be able to do that, many researchers tried to explore such properties in sarcasm like theories of sarcasm, syntactical properties, psycholinguistic of sarcasm, lexical feature, semantic properties, etc. Studies conducted within last 15 years have not only made progress in semantic features but have also shown increasing amounts of methods of analysis using a machine-learning approach to process data. Therefore, this paper will try to explain the most currently used methods to detect sarcasm. Lastly, we will present a result of our finding, which might help other researchers to gain a better result in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2018
6. Extraction of Meaningful Information from Unstructured Clinical Notes Using Web Scraping.
- Author
- Varshini, K. Sukanya and Uthra, R. Annie
- Subjects
- NATURAL language processing, MACHINE learning, DATA mining, FEATURE selection, RANDOM forest algorithms, MEDICAL transcription
- Abstract
In the medical field, the clinical notes taken by the doctor, nurse, or medical practitioner are considered to be one of the most important medical documents. These documents hold information regarding the patient including the patient's current condition, family history, disease, symptoms, medications, lab test reports, and other vital information. Despite these documents holding important information regarding the patients, they cannot be used as the data are unstructured. Organizing a huge amount of data without any mistakes is highly impossible for humans, so ignoring unstructured data is not advisable. Hence, to overcome this issue, the web scraping method is used to extract the clinical notes from the Medical Transcription (MT) samples which hold many transcripted clinical notes of various departments. In the proposed method, Natural Language Processing (NLP) is used to pre-process the data, and the variants of the Term Frequency-Inverse Document Frequency (TF-IDF)-based vector model are used for the feature selection, thus extracting the required data from the clinical notes. The performance measures including the accuracy, precision, recall and F1 score are used in the identification of disease, and the result obtained from the proposed system is compared with the best performing machine learning algorithms including the Logistic Regression, Multinomial Naive Bayes, Random Forest classifier and Linear SVC. The result obtained proves that the Random Forest Classifier obtained a higher accuracy of 90% when compared to the other algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
7. AN EFFICIENT OBJECT ORIENTED TEXT ANALYSIS (OOTA) APPROACH TO CONSTRUCT STATIC STRUCTURE WITH DYNAMIC BEHAVIOR.
- Author
- EL-SAID, ASMAA M., ELDESOKY, ALI I., and ARAFAT, HESHAM A.
- Subjects
- OBJECT-oriented methods (Computer science), ELECTRONIC data processing, INFORMATION overload, DATA mining, MACHINE learning, NATURAL language processing, INFORMATION retrieval, KNOWLEDGE management
- Abstract
In many fields of science, IT applications and business environments successfully evolved systems to receive vast amount of electronic data and information. Due to increasing electronic data and information, most recent researches have tried to find a solution to resolve the crisis of information overload. These solutions include a combination of techniques of data mining, machine learning, natural language processing and information retrieval, information extraction, and knowledge management. A great challenge is how to exploit those information and knowledge resources and turn them into useful knowledge available to concerned people. The value of knowledge increases when people can share and capitalize on it. Thus, approaches that can help researchers to benefit from existing hidden knowledge are needed. For this, tools that can analyze, extract and explore relevant and useful information with relations are required. So, the main contribution of this paper is to integrate the technology of XML with text analysis for introducing an efficient concept-based structure model, where this model can represent the text in a form that can be easily understood, shared, managed and mined. This paper describes an efficient object oriented text analysis (OOTA) approach by generating an object oriented model that transforms unstructured text to a specific structured form and stored in XML format. The experimental results show that this approach has a good promotion on results. [ABSTRACT FROM AUTHOR]
- Published
- 2013
8. A NEW METHODOLOGY FOR DOMAIN ONTOLOGY CONSTRUCTION FROM THE WEB.
- Author
- FRIKH, BOUCHRA, DJAANFAR, AHMED SAID, and OUHBI, BRAHIM
- Subjects
- ONTOLOGIES (Information retrieval), WEB services, NATURAL language processing, INFORMATION retrieval, INTERNET, DATA mining, INFORMATION resources, ALGORITHMS
- Abstract
Resources like ontologies are used in a number of applications, including natural language processing and information retrieval (especially from the Internet). Different methods have been proposed to build such resources. This paper proposes a new method to extract information from the Web to build a taxonomy of terms and Web resources for a given domain. Firstly, a (CHIR) method is used to identify candidate terms. Then a similarity (SIM) measure is introduced to select relevant concepts to build the ontology. Our new algorithm, called (CHIRSIM), is easy to implement and can be efficiently integrated into an information retrieval system to help improve the retrieval performance. Experimental results show that the proposed approach can effectively and efficiently construct a cancer domain ontology from unstructured text documents. [ABSTRACT FROM AUTHOR]
- Published
- 2011
9. A WORKFLOW FOR MUTATION EXTRACTION AND STRUCTURE ANNOTATION.
- Author
- KANAGASABAI, RAJARAMAN, KHAR HENG CHOO, RANGANATHAN, SHOBA, and BAKER, CHRISTOPHER J. O.
- Subjects
- INFORMATION retrieval, TEXT mining, DATA mining, NATURAL language processing, ELECTRONIC data processing, ONTOLOGY
- Abstract
Rich information on point mutation studies is scattered across heterogeneous data sources. This paper presents an automated workflow for mining mutation annotations from full-text biomedical literature using natural language processing (NLP) techniques as well as for their subsequent reuse in protein structure annotation and visualization. This system, called mSTRAP (Mutation extraction and STRucture Annotation Pipeline), is designed for both information aggregation and subsequent brokerage of the mutation annotations. It facilitates the coordination of semantically related information from a series of text mining and sequence analysis steps into a formal OWL-DL ontology. The ontology is designed to support application-specific data management of sequence, structure, and literature annotations that are populated as instances of object and data type properties. mSTRAPviz is a subsystem that facilitates the brokerage of structure information and the associated mutations for visualization. For mutated sequences without any corresponding structure available in the Protein Data Bank (PDB), an automated pipeline for homology modeling is developed to generate the theoretical model. With mSTRAP, we demonstrate a workable system that can facilitate automation of the workflow for the retrieval, extraction, processing, and visualization of mutation annotations — tasks which are well known to be tedious, time-consuming, complex, and error-prone. The ontology and visualization tool are available at . [ABSTRACT FROM AUTHOR]
- Published
- 2007
10. Mining Semantics Structures from Syntactic Structures in Web Document Corpora.
- Author
- Mousavi, Hamid, Gao, Shi, Kerr, Deirdre, Iseli, Markus, and Zaniolo, Carlo
- Subjects
- WORLD Wide Web, DATA mining, SEMANTIC computing, DATA management, DATA science
- Abstract
The Web is making possible many advanced text-mining applications, such as news summarization, essay grading, question answering, semantic search and structured queries on corpora of Web documents. For many such applications, statistical text-mining techniques are of limited effectiveness since they do not utilize the morphological structure of the text. On the other hand, many approaches use NLP-based techniques that parse the text into parse trees, and then use patterns to mine and analyze parse trees which are often unnecessarily complex. To reduce this complexity and ease the entire process of text mining, we propose a weighted-graph representation of text, called TextGraphs, which captures the grammatical and semantic relations between words and terms in the text. TextGraphs are generated using a new text mining framework which is the main focus of this paper. Our framework, SemScape, uses a statistical parser to generate a few of the most probable parse trees for each sentence and employs a novel two-step pattern-based technique to extract from parse trees candidate terms and their grammatical relations. Moreover, SemScape resolves coreferences by a novel technique, generates domain-specific TextGraphs by consulting ontologies, and provides a SPARQL-like query language and an optimized engine for semantically querying and mining TextGraphs. [ABSTRACT FROM AUTHOR]
- Published
- 2014
11. A Hybrid Multilingual Fuzzy-Based Approach to the Sentiment Analysis Problem Using SentiWordNet.
- Author
- Madani, Youness, Erritali, Mohammed, Bengourram, Jamaa, and Sailhan, Francoise
- Subjects
- *SENTIMENT analysis, *NATURAL language processing, *FUZZY logic, *SOCIAL network analysis, *DATA mining, *PRODUCT reviews
- Abstract
Sentiment analysis, or in particular social network analysis (SNA), is a new research area that has grown explosively. This domain has become a very active research issue in data mining and natural language processing. Sentiment analysis (opinion mining) consists of analyzing and extracting emotions, opinions or attitudes from product reviews, movie reviews, etc., and classifying them into classes such as positive, negative and neutral, or extracting the degree of importance (polarity). In this paper, we propose a new hybrid approach for classifying tweets into classes based on fuzzy logic and a lexicon-based approach using SentiWordNet. Our approach consists of classifying tweets according to three classes: positive, negative or neutral, using SentiWordNet and fuzzy logic with its three important steps: Fuzzification, Rule Inference/aggregation, and Defuzzification. The dataset of tweets to classify and the result of the classification are stored in the Hadoop Distributed File System (HDFS), and we use Hadoop MapReduce for the application of our proposal. [ABSTRACT FROM AUTHOR]
- Published
- 2020
12. AUTOMATIC GENERATION OF CROSSWORD PUZZLES.
- Author
- RIGUTINI, LEONARDO, DILIGENTI, MICHELANGELO, MAGGINI, MARCO, and GORI, MARCO
- Subjects
- AUTOMATION, CROSSWORD puzzles, HUMAN-computer interaction, CONSTRAINT satisfaction, COMPUTER programming, DATA mining, NATURAL language processing
- Abstract
Crossword puzzles are used everyday by millions of people for entertainment, but have applications also in educational and rehabilitation contexts. Unfortunately, the generation of ad-hoc puzzles, especially on specific subjects, typically requires a great deal of human expert work. This paper presents the architecture of WebCrow-generation, a system that is able to generate crosswords with no human intervention, including clue generation and crossword compilation. In particular, the proposed system crawls information sources on the Web, extracts definitions from the downloaded pages using state-of-the-art natural language processing techniques and, finally, compiles the crossword schema with the extracted definitions by constraint satisfaction programming. The system has been tested on the creation of Italian crosswords, but the extensive use of machine learning makes the system easily portable to other languages. [ABSTRACT FROM AUTHOR]
- Published
- 2012
13. Solving Arithmetic Word Problems by Object Oriented Modeling and Query-Based Information Processing.
- Author
- Mandal, Sourav and Naskar, Sudip Kumar
- Subjects
- *INFORMATION processing, *INFORMATION modeling, *LEARNING Management System, *DATA mining, *ONLINE education, *NATURAL language processing
- Abstract
The paper presents an Object Oriented Analysis and Design (OOAD) approach to modeling, reasoning and a database query based approach to processing and solving addition-subtraction (Add-Sub) type arithmetic Mathematical Word Problems (MWP) of elementary school level. The system identifies and extracts the key entities in a word problem, such as owners, items and their attributes and quantities, and verbs, from all the input sentences, using a rule-based Information Extraction (IE) approach based on the Semantic Role Labeling (SRL) technique. This information is then stored in predefined templates which are further modeled to represent an MWP in the object-oriented paradigm and processed using a query-based approach to generate the answer. These kinds of applications are based on Natural Language Processing (NLP), Natural Language Understanding (NLU) and Artificial Intelligence (AI), and can be used as intelligent dynamic mathematical tutoring tools as part of E-Learning systems, Learning Management Systems, on-line education, etc. The proposed object oriented mathematical word problem solver can solve arithmetic MWPs involving only addition-subtraction operations and it has produced an accuracy of 94.35% on a subset of the AI2 arithmetic questions dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2019
14. Advances in the Application of Traditional Chinese Medicine Using Artificial Intelligence: A Review.
- Author
- Zhang, Sheng, Wang, Wei, Pi, Xitian, He, Zichun, and Liu, Hongying
- Subjects
- PHYSICAL diagnosis, AUSCULTATION, DATABASES, CLINICAL decision support systems, NATURAL language processing, ARTIFICIAL intelligence, PATTERN perception receptors, MEDICAL technology, INTELLECT, PALPATION, CHINESE medicine, MEDICAL research, DATA mining
- Abstract
Traditional Chinese medicine (TCM), as one of the crystallizations of Chinese wisdom, emphasizes the balance of Yin and Yang to keep the body healthy. Under the theoretical guidance of a holistic view, the diagnostic process in TCM has characteristics of subjectivity, fuzziness, and complexity. Therefore, realizing standardization and achieving objective quantitative analysis are the bottlenecks of the development of TCM. The emergence of artificial intelligence (AI) technology has brought unprecedented challenges and opportunities to traditional medicine, which is expected to provide objective measurements and improve the clinical efficacy. However, the combination of TCM and AI is still in its infancy and currently faces many challenges. Therefore, this review provides a comprehensive discussion of the existing advances, problems, and prospects of the applications of AI technologies in TCM with the hope of promoting a better understanding of the TCM modernization and intellectualization. [ABSTRACT FROM AUTHOR]
- Published
- 2023
15. TEXT AND DATA MINING FOR BIOMEDICAL DISCOVERY.
- Author
- GONZALEZ, GRACIELA, COHEN, KEVIN BRETONNEL, GREENE, CASEY S., KANN, MARICEL G., LEAMAN, ROBERT, SHAH, NIGAM, and JIEPING YE
- Subjects
- DATA mining, TEXT mining, PROTEOMICS, INDIVIDUALIZED medicine, NATURAL language processing
- Published
- 2013
16. Semantic Similarity for English and Arabic Texts: A Review.
- Author
- Alian, Marwah and Awajan, Arafat
- Subjects
- PEARSON correlation (Statistics), DATA mining, NATURAL language processing
- Abstract
Semantic similarity is the task of measuring relations between sentences or words to determine the degree of similarity or resemblance. Several applications of natural language processing require semantic similarity measurement to achieve good results; these applications include plagiarism detection, text entailment, text summarisation, paraphrasing identification, and information extraction. Many researchers have proposed new methods to measure the semantic similarity of Arabic and English texts. In this research, these methods are reviewed and compared. Results show that the precision of the corpus-based approach exceeds 0.70. The precision of the descriptive feature-based technique is between 0.670 and 0.86, with a Pearson correlation coefficient of over 0.70. Meanwhile, the word embedding technique has a correlation of 0.67, and its accuracy is in the range 0.76–0.80. The best results are achieved by the feature-based approach. [ABSTRACT FROM AUTHOR]
- Published
- 2020
17. SOCIAL MEDIA MINING SHARED TASK WORKSHOP.
- Author
- SARKER, ABEED, NIKFARJAM, AZADEH, and GONZALEZ, GRACIELA
- Subjects
- SOCIAL media, DATA mining, DRUG side effects, ADULT education workshops, NATURAL language processing
- Published
- 2015
18. MANAGEMENT OF SUBJECTIVE INFORMATION AND FUZZINESS.
- Author
- BOUCHON-MEUNIER, BERNADETTE
- Subjects
- INFORMATION processing, FUZZY control systems, SUBJECTIVITY, DATA mining, NATURAL language processing, FUZZY logic
- Published
- 2010
19. MUMIS - ADVANCED INFORMATION EXTRACTION FOR MULTIMEDIA INDEXING AND SEARCHING.
- Author
- DECLERCK, T., CUNNINGHAM, H., SAGGION, H., KUPER, J., REIDSMA, D., and WITTENBURG, P.
- Subjects
- NATURAL language processing, DATA mining, MULTIMEDIA systems, INDEXING, INFORMATION storage & retrieval systems
- Published
- 2003
Discovery Service for Jio Institute Digital Library