1. Multi-attention deep neural network fusing character and word embedding for clinical and biomedical concept extraction.
- Author
- Fan, Shengyu, Yu, Hui, Cai, Xiaoya, Geng, Yanfang, Li, Guangzhen, Xu, Weizhi, Wang, Xia, and Yang, Yaping
- Subjects
- *ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *NATURAL language processing, *LITERARY criticism
- Abstract
• Local and global self-attention mechanisms are used for character embedding.
• A CNN with multi-size filters is used to extract character information for NER.
• A cross-attention method that fuses character and word embedding for NER is proposed.
• A modified Mogrifier LSTM is presented to improve the performance of NER.
• The proposed methods, integrated with a transformer-based model, achieve good performance.

Clinical and biomedical concept extraction is critical in medical analysis using clinical and biomedical documents from professional literature, electronic health records (EHRs) and personal health records (PHRs). Named entity recognition (NER) accurately marks essential information in the literature based on the characteristics of the target entity, providing a method for extracting clinical and biomedical concepts. The performance of NER depends heavily on the quality of the embeddings, so recent studies have proposed generating word embeddings from character-level information, which strengthens the representation ability of word embeddings. In this paper, we present a novel neural network model that combines an attention mechanism network with a convolutional neural network (CNN) to further improve character-level embedding. First, an attention mechanism is applied simultaneously to local and global character embeddings. Then, a CNN with multi-size filters is used to extract more information at the character level, capturing meaningful features from words of various lengths. In addition, a cross-attention method leverages the interaction between word embedding and character embedding to generate the final word representation. Finally, we modify the Mogrifier LSTM to make it suitable for NER tasks and integrate it into our model. Experimental results show that our method is effective and that the model outperforms the baseline models. Applying the methods proposed in this paper to a transformer-based model yields an F1-score of 90.36 on NCBI-Disease. [ABSTRACT FROM AUTHOR]
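The record does not give the paper's exact cross-attention formulation, but the idea of fusing a word embedding with a character-derived embedding can be sketched with scaled dot-product attention, where word vectors act as queries over per-word character vectors. Below is a minimal, hedged NumPy sketch; the shapes, the projection matrices `W_q`/`W_k`/`W_v`, and the final concatenation step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(word_emb, char_emb, W_q, W_k, W_v):
    """Fuse word and character embeddings via cross-attention.

    word_emb: (T, d) one vector per token from a word-embedding table
    char_emb: (T, d) one vector per token from a character encoder
               (e.g. a multi-size-filter CNN pooled over characters)
    Returns a (T, 2d) fused representation (illustrative design choice).
    """
    q = word_emb @ W_q                       # queries from word embeddings
    k = char_emb @ W_k                       # keys from character embeddings
    v = char_emb @ W_v                       # values from character embeddings
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T, T) scaled dot-product
    attended = softmax(scores, axis=-1) @ v  # char info weighted per word
    # final representation: word embedding enriched with attended char info
    return np.concatenate([word_emb, attended], axis=-1)

T, d = 5, 8
word_emb = rng.standard_normal((T, d))
char_emb = rng.standard_normal((T, d))
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
fused = cross_attention_fuse(word_emb, char_emb, W_q, W_k, W_v)
print(fused.shape)  # (5, 16)
```

In the paper's pipeline, `char_emb` would come from the attention-augmented multi-size-filter CNN over characters, and `fused` would feed the (modified Mogrifier) LSTM or transformer layers for tagging.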
- Published
- 2022