241 results for "ALBERT"
Search Results
2. Exploring transformer models for sentiment classification: A comparison of BERT, RoBERTa, ALBERT, DistilBERT, and XLNet.
- Author
-
Areshey, Ali and Mathkour, Hassan
- Subjects
-
LANGUAGE models, TRANSFORMER models, STATISTICAL learning, SENTIMENT analysis, MACHINE learning
- Abstract
Transfer learning models have proven superior to classical machine learning approaches in various text classification tasks, such as sentiment analysis, question answering, news categorization, and natural language inference. Recently, these models have shown exceptional results in natural language understanding (NLU). Advanced attention‐based language models like BERT and XLNet excel at handling complex tasks across diverse contexts. However, they encounter difficulties when applied to specific domains. Platforms like Facebook, characterized by continually evolving casual and sophisticated language, demand meticulous context analysis even from human users. The literature has proposed numerous solutions using statistical and machine learning techniques to predict the sentiment (positive or negative) of online customer reviews, but most of them rely on various business, review, and reviewer features, which leads to generalizability issues. Furthermore, there have been very few studies investigating the effectiveness of state‐of‐the‐art pre‐trained language models for sentiment classification in reviews. Therefore, this study aims to assess the effectiveness of BERT, RoBERTa, ALBERT, DistilBERT, and XLNet in sentiment classification using the Yelp reviews dataset. The models were fine‐tuned, and the results obtained with the same hyperparameters are as follows: 98.30 for RoBERTa, 98.20 for XLNet, 97.40 for BERT, 97.20 for ALBERT, and 96.00 for DistilBERT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
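The comparison protocol in entry 2 (fine-tune each pre-trained checkpoint with identical hyperparameters on Yelp reviews) maps directly onto the Hugging Face Trainer API. A minimal sketch follows; the yelp_polarity dataset and the hyperparameter values are illustrative assumptions, not the paper's exact setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "roberta-base"  # swap for "albert-base-v2", "distilbert-base-uncased", etc.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Yelp polarity: positive/negative review classification, as in the study.
dataset = load_dataset("yelp_polarity")
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True, max_length=256),
                      batched=True)

# Identical hyperparameters across all models, mirroring the paper's protocol.
args = TrainingArguments(output_dir="out", learning_rate=2e-5,
                         per_device_train_batch_size=16, num_train_epochs=3)
trainer = Trainer(model=model, args=args, tokenizer=tokenizer,
                  train_dataset=dataset["train"], eval_dataset=dataset["test"])
trainer.train()
print(trainer.evaluate())
```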
3. Aspect-Level Sentiment Analysis Based on Lite Bidirectional Encoder Representations From Transformers and Graph Attention Networks.
- Author
-
Xu, Longming, Xiao, Ping, and Zeng, Huixia
- Subjects
-
LANGUAGE models, SENTIMENT analysis, INFORMATION networks
- Abstract
Aspect-level sentiment analysis is a critical component of sentiment analysis, aiming to determine the sentiment polarity associated with specific aspect words. However, existing methodologies have limitations in effectively managing aspect-level sentiment analysis. These limitations include insufficient utilization of syntactic information and an inability to precisely capture the contextual nuances surrounding aspect words. To address these issues, we propose an Aspect-Oriented Graph Attention Network (AOGAT) model. This model incorporates syntactic information to generate dynamic word vectors through the pre-trained model ALBERT and combines a graph attention network with BiGRU to capture both syntactic and semantic features. Additionally, the model introduces an aspect-focused attention mechanism to retrieve features related to aspect words and integrates the generated representations for sentiment classification. Our experiments on three datasets demonstrate that the AOGAT model outperforms traditional models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
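Entry 3's AOGAT combines ALBERT word vectors with a graph attention network over the sentence's syntactic structure (plus BiGRU and aspect-focused attention, omitted here). A minimal single-head graph attention layer, assuming an adjacency matrix from a dependency parse with self-loops; this is a generic GAT sketch, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention over a syntactic adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (n_nodes, in_dim) token features; adj: (n, n) 0/1 mask from the
        # dependency parse. adj must include self-loops so no row is all zero.
        z = self.W(h)
        n = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1))   # raw attention scores
        e = e.masked_fill(adj == 0, float("-inf"))    # attend only along edges
        alpha = torch.softmax(e, dim=-1)
        return alpha @ z                              # aggregated node features

h = torch.randn(5, 768)          # e.g. ALBERT token vectors
adj = torch.eye(5)               # self-loops; add dependency edges here
out = GraphAttentionLayer(768, 64)(h, adj)
```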
4. A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model.
- Author
-
Um, Taehum and Kim, Namhyoung
- Subjects
LANGUAGE models, ARTIFICIAL neural networks, TRANSFORMER models, LATENT variables, STOCHASTIC models
- Abstract
As an extension of the transformer architecture, the BERT model has introduced a new paradigm for natural language processing, achieving impressive results in various downstream tasks. However, high-performance BERT-based models—such as ELECTRA, ALBERT, and RoBERTa—suffer from limitations such as poor continuous learning capability and insufficient understanding of domain-specific documents. To address these issues, we propose the use of an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Furthermore, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, thereby enabling BERT-based models to sufficiently comprehend the contexts of specific fields. We conducted experiments on three datasets—Movie Review Dataset (MRD), 20Newsgroups, and YELP—to evaluate our model's performance. Compared to the vanilla model, the proposed model achieved an accuracy improvement of 1–2% for the ALBERT model in multiclassification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Multi-Modal Sentiment Analysis Based on Image and Text Fusion Based on Cross-Attention Mechanism.
- Author
-
Li, Hongchan, Lu, Yantong, and Zhu, Haodong
- Subjects
SENTIMENT analysis, IMAGE analysis, USER-generated content, TEXT recognition, FEATURE extraction, IMAGE fusion
- Abstract
Research on uni-modal sentiment analysis has achieved great success, but emotions in real life are mostly multi-modal; there are not only texts but also images, audio, video, and other forms, and the various modalities reinforce one another. If the connections between modalities can be mined, the accuracy of sentiment analysis can be further improved. To this end, this paper introduces a cross-attention-based multi-modal fusion model for images and text, namely MCAM. First, we use the ALBERT pre-trained model to extract text features and BiLSTM to extract textual context features; for images, we use DenseNet121 to extract image features and CBAM to extract the specific regions related to emotion. Finally, we utilize multi-modal cross-attention to fuse the features extracted from the text and image, and we classify the output to determine the emotional polarity. In comparative experiments on the public MVSA and TumEmo datasets, the proposed model outperforms the baseline models, with accuracy and F1 scores reaching 86.5% and 75.3%, and 85.5% and 76.7%, respectively. We also conducted ablation experiments, which confirmed that sentiment analysis with multi-modal fusion is better than single-modal sentiment analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
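Entry 5's MCAM fuses text and image features with cross-attention in both directions. A minimal sketch of that fusion step, with the upstream extractors (ALBERT+BiLSTM for text, DenseNet121+CBAM for images) stubbed out and all dimensions illustrative.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Cross-attention fusion of token features and image-region features."""
    def __init__(self, dim=256, heads=4, num_classes=2):
        super().__init__()
        self.text_to_image = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, text_feats, image_feats):
        # text_feats: (B, T, dim) token features; image_feats: (B, R, dim) regions.
        t2i, _ = self.text_to_image(text_feats, image_feats, image_feats)
        i2t, _ = self.image_to_text(image_feats, text_feats, text_feats)
        fused = torch.cat([t2i.mean(dim=1), i2t.mean(dim=1)], dim=-1)
        return self.classifier(fused)    # sentiment polarity logits

model = CrossModalFusion()
logits = model(torch.randn(8, 32, 256), torch.randn(8, 49, 256))
```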
6. Graphic association learning: Multimodal feature extraction and fusion of image and text using artificial intelligence techniques
- Author
-
Guangyun Lu, Zhiping Ni, Ling Wei, Junwei Cheng, and Wei Huang
- Subjects
Text matching, Image matching, ALBERT, Mask R-CNN, DCGAN, Multimodal feature, Science (General), Q1-390, Social sciences (General), H1-99
- Abstract
With the advancement of technology in recent years, the application of artificial intelligence in real life has become more extensive. Graphic recognition is a hot spot in current research: it involves machines extracting key information from pictures and combining it with natural language processing for in-depth understanding. Existing methods still have obvious deficiencies in fine-grained recognition and deep contextual understanding. Addressing these issues to achieve high-quality image-text recognition is crucial for application scenarios such as accessibility technologies, content creation, and virtual assistants. To tackle this challenge, a novel approach is proposed that combines the Mask R-CNN, DCGAN, and ALBERT models. Specifically, Mask R-CNN specializes in high-precision image recognition and segmentation, DCGAN captures and generates nuanced features from images, and the ALBERT model is responsible for deep natural language processing and semantic understanding of this visual information. Experimental results clearly validate the superiority of this method: compared to traditional image-text recognition techniques, recognition accuracy improves from 85.3% to 92.5%, and performance in contextual and situational understanding is enhanced. This advancement has far-reaching implications for research in machine vision and natural language processing and opens new possibilities for practical applications.
- Published
- 2024
- Full Text
- View/download PDF
7. Multilingual Question Answering for Malaysia History with Transformer-based Language Model
- Author
-
Qi Zhi Lim, Chin Poo Lee, Kian Ming Lim, Jing Xiang Ng, Eric Khang Heng Ooi, and Nicole Kai Ning Loh
- Subjects
question answering, historical knowledge, natural language processing, debertav3, bert, albert, electra, minilm, roberta, Technology (General), T1-995, Social sciences (General), H1-99
- Abstract
In natural language processing (NLP), a Question Answering System (QAS) refers to a system or model designed to understand and respond to user queries in natural language. Recent advancements in QAS show a paradigm shift from traditional machine learning and deep learning approaches towards transformer-based language models. While significant progress has been made, the utilization of these models for historical QAS and the development of QAS for the Malay language remain largely unexplored. This research aims to bridge these gaps by developing a multilingual QAS for the history of Malaysia using a transformer-based language model. The system development process encompasses various stages, including data collection, knowledge representation, data loading and pre-processing, document indexing and storing, and the establishment of a querying pipeline with the retriever and reader. A dataset of 100 articles, including web blogs related to the history of Malaysia, was constructed to serve as the knowledge base for the proposed QAS. A significant aspect of this research is the use of a dataset translated into English instead of the raw dataset in Malay; this decision was made to leverage well-established retriever and reader models trained on English data. Moreover, an evaluation dataset comprising 100 question-answer pairs was created to evaluate the performance of the models. A comparative analysis of six transformer-based language models, namely DeBERTaV3, BERT, ALBERT, ELECTRA, MiniLM, and RoBERTa, was conducted, examining the models through a series of experiments to determine the best reader model for the proposed QAS. The experimental results reveal that the proposed QAS achieved the best performance when employing RoBERTa as the reader model. Finally, the proposed QAS was deployed on Discord and equipped with multilingual support through language detection and translation modules, enabling it to handle queries in both Malay and English. DOI: 10.28991/ESJ-2024-08-02-019
- Published
- 2024
- Full Text
- View/download PDF
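Entry 7's querying pipeline pairs a retriever (rank articles against the question) with a reader (extract the answer span). A toy sketch, assuming a TF-IDF retriever and a public SQuAD2 reader checkpoint as stand-ins for the paper's actual components.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

# Tiny stand-in knowledge base; the paper's is 100 translated articles.
articles = [
    "Malacca was founded by Parameswara, a Srivijayan prince, around 1400.",
    "The Federation of Malaya gained independence from Britain on 31 August 1957.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(articles)
reader = pipeline("question-answering", model="deepset/roberta-base-squad2")

def answer(question):
    # Retriever: pick the article most similar to the question.
    scores = cosine_similarity(vectorizer.transform([question]), doc_matrix)[0]
    context = articles[scores.argmax()]
    # Reader: extract the answer span from that article.
    return reader(question=question, context=context)["answer"]

print(answer("Who founded Malacca?"))
```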
8. Advanced Analysis of Learning-Based Spam Email Filtering Methods Based on Feature Distribution Differences of Dataset
- Author
-
Jin-Seong Kim, Han-Jin Lee, Han-Ju Lee, and Seok-Hwan Choi
- Subjects
Spam email filtering, recurrent neural network (RNN), gated recurrent unit (GRU), long short-term memory (LSTM), ALBERT, security, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Spam emails, which are unsolicited bulk emails, pose a significant threat to digital communication security. To counter them, learning-based spam email filtering methods have been extensively studied. However, as spam patterns evolve, these methods face challenges in maintaining the accuracy of models trained on outdated patterns. To demonstrate these limitations empirically and gain insight into the classification patterns of spam email filtering models, we propose an advanced analysis method for the performance degradation of such models. The proposed method involves text preprocessing, embedding model training, spam email filtering model training, evaluation, and analysis of the classification patterns of the learning-based filtering models. From experimental results under various datasets and spam email filtering models, we show that the accuracy of spam email filtering models decreases significantly when the feature distribution of the test dataset differs from that of the training dataset. We also provide valuable insights for improving the model architecture, dataset structure, and training strategies through analysis of factors such as the confusion matrix, performance metrics, mean sequence length, out-of-vocabulary (OOV) rate, and top-20 tokens.
- Published
- 2024
- Full Text
- View/download PDF
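Entry 8 attributes accuracy loss to train/test feature-distribution shift, measured through statistics such as mean sequence length and out-of-vocabulary (OOV) rate. A minimal sketch of those two statistics, with whitespace tokenization as a simplifying assumption.

```python
# Mean sequence length and OOV rate of a test set relative to a training vocabulary.
def mean_seq_len(emails):
    return sum(len(e.split()) for e in emails) / len(emails)

def oov_rate(train_emails, test_emails):
    vocab = {tok for e in train_emails for tok in e.split()}
    test_tokens = [tok for e in test_emails for tok in e.split()]
    return sum(tok not in vocab for tok in test_tokens) / len(test_tokens)

train = ["win a free prize now", "meeting agenda attached"]
test = ["claim your crypto reward now"]
print(mean_seq_len(test), oov_rate(train, test))  # high OOV -> likely accuracy drop
```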
9. A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model
- Author
-
Taehum Um and Namhyoung Kim
- Subjects
natural language processing, neural topic model, ELECTRA, ALBERT, multi-classification, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
As an extension of the transformer architecture, the BERT model has introduced a new paradigm for natural language processing, achieving impressive results in various downstream tasks. However, high-performance BERT-based models—such as ELECTRA, ALBERT, and RoBERTa—suffer from limitations such as poor continuous learning capability and insufficient understanding of domain-specific documents. To address these issues, we propose the use of an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Furthermore, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, thereby enabling BERT-based models to sufficiently comprehend the contexts of specific fields. We conducted experiments on three datasets—Movie Review Dataset (MRD), 20Newsgroups, and YELP—to evaluate our model’s performance. Compared to the vanilla model, the proposed model achieved an accuracy improvement of 1–2% for the ALBERT model in multiclassification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%.
- Published
- 2024
- Full Text
- View/download PDF
10. Named Entity Recognition of Wheat Diseases and Pests Fusing ALBERT and Rules
- Author
-
LIU Hebing, ZHANG Demeng, XIONG Shufeng, MA Xinming, XI Lei
- Subjects
wheat diseases and pests, data augmentation, named entity recognition (ner), albert, rules amendment, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Named entity recognition of wheat diseases and pests is a key step in building a knowledge graph. Aiming at the problems of scarce training data, complex entity structure, diverse entity types, and uneven entity distribution in the wheat diseases and pests field, and on the premise of fully mining implicit knowledge, two data augmentation methods are used to expand sentence semantic information and to construct the corpus WpdCNER (wheat pests and diseases Chinese named entity recognition) and the field lexicon WpdDict (wheat pests and diseases dictionary). Sixteen categories of entities are defined under the guidance of field experts. Meanwhile, a Chinese named entity recognition model based on rules amendment, WPD-RA (wheat pests and diseases-rules amendment model), is proposed. The model performs entity recognition with ALBERT+BiLSTM+CRF (a lite bi-directional encoder representation from transformers + bi-directional long short-term memory + conditional random field), and specific rules are defined to amend entity boundaries after recognition. The WPD-RA model achieves the best results, with 94.72% precision, 95.23% recall, and 94.97% F1. Compared with the model without rules, precision increases by 1.71 percentage points, recall by 0.34 percentage points, and F1 by 1.03 percentage points. Experimental results show that the model can effectively recognize named entities in the wheat diseases and pests field and performs better than other models. The proposed model provides a reference for named entity recognition tasks in other fields such as food safety and biology.
- Published
- 2023
- Full Text
- View/download PDF
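Entry 10's distinctive step is the rules amendment: after the ALBERT+BiLSTM+CRF tagger predicts entity spans, hand-written rules re-align span boundaries against the domain lexicon. A toy sketch of the idea, with an invented one-character-widening rule and invented lexicon entries; the paper's actual rules are not reproduced here.

```python
# WpdDict-style lexicon entries (illustrative): Fusarium head blight, wheat aphid.
WPD_DICT = {"小麦赤霉病", "小麦蚜虫"}

def amend_boundaries(text, spans):
    """spans: list of (start, end, label) predicted by the tagger."""
    amended = []
    for start, end, label in spans:
        # Rule: if widening the span by one character matches a lexicon
        # entry, trust the lexicon over the tagger's boundary.
        for s, e in [(start - 1, end), (start, end + 1)]:
            if 0 <= s and e <= len(text) and text[s:e] in WPD_DICT:
                start, end = s, e
                break
        amended.append((start, end, label))
    return amended

print(amend_boundaries("小麦赤霉病危害严重", [(0, 4, "DISEASE")]))
# -> [(0, 5, 'DISEASE')]: the truncated span is widened to the lexicon entry.
```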
11. A study of deep semantic matching in question-and-answer events in civil litigation in the environmental justice system
- Author
-
Zhu Xiaomiao
- Subjects
albert, bert, semantic matching, question categorization, civil litigation, 68m11, Mathematics, QA1-939
- Abstract
Information retrieval and text mining extensively utilize text semantic matching models. In this paper, civil litigation Q&A under the environmental justice system is taken as a specific research field, and after constructing a deep-learning-based civil litigation Q&A system, two of its key techniques, question categorization and semantic matching, are selected as the main research content. Specifically, the ALBERT algorithm is used to extract word vectors; hidden feature vectors are obtained through BiLSTM modeling of contextual relationships and then combined with the attention mechanism for scoring and weighting to obtain the final text-level vectors for classification, establishing the ALBERT-based civil litigation question classification model. We then establish the BERT-based civil litigation question-answer matching model by sorting the set of candidate answers by semantic matching degree with the BERT algorithm. Experiments on selected datasets and comparison algorithms show that the question classification model performs well on civil litigation question text classification, with each index improved by 0.75%~3.00% over the baseline model. The MAP and MRR values (0.76~0.86) of the question-matching model are higher than those of the comparison models, verifying its superior performance in semantic matching. The model proposed in this paper can provide civil litigation counseling to the public.
- Published
- 2024
- Full Text
- View/download PDF
12. Analysis of the multiple dimensions of ideological education in Marxist theory
- Author
-
Jiang Li
- Subjects
albert, sabl, textual emotions, ideological education, dimensions of identity, 97b20, Mathematics, QA1-939
- Abstract
This paper utilizes a model to analyze the textual emotions of ideology in Marxist theory. Long short-term memory networks are chosen as the main method of text analysis to construct the main process of Marxist ideological education, and a hybrid self-attention mechanism is combined to improve the efficiency of extracting data features from the text. The results show that the ALBERT-SABL-based sentiment analysis model is 86.9% accurate in extracting the sentiment of the ideology, with an F1 value of 87.6%; compared with TextCNN, accuracy is improved by 1.8%. Different schools have different levels of identification with Marx's ideology: under the identification dimension, the highest sentiment identification score among the eight school samples is School 2, with a score of 100. This study provides reference data for the ideological education of Marxist theory and promotes the development of Marx's ideology.
- Published
- 2024
- Full Text
- View/download PDF
13. Chinese Named Entity Recognition in Football Based on ALBERT-BiLSTM Model.
- Author
-
An, Qi, Pan, Bingyu, Liu, Zhitong, Du, Shutong, and Cui, Yixiong
- Subjects
DEEP learning, SOCCER, TEXT mining, RANDOM fields
- Abstract
Football is one of the most popular sports in the world, giving rise to a wide range of research topics related to its off- and on-the-pitch performance. The extraction of football entities from football news helps to construct sports frameworks, integrate sports resources, and capture the dynamics of the sport in a timely manner through visual text mining results, including the connections among football players, football clubs, and football competitions, and it is of great convenience for observing and analyzing the developmental tendencies of football. Therefore, in this paper, we constructed a 1,000,000-word Chinese corpus in the field of football and proposed a BiLSTM-based model for named entity recognition. The ALBERT-BiLSTM deep learning combination model is used for entity extraction from football text data. Based on the BiLSTM model, we introduced ALBERT as a pre-training model to extract character-level features and enhance the generalization ability of word embedding vectors. We then compared the results of two different annotation schemes, BIO and BIOE, and two deep learning models, ALBERT-BiLSTM-CRF and ALBERT-BiLSTM. It was verified that BIOE tagging was superior to BIO and that the ALBERT-BiLSTM model was more suitable for the football dataset. The precision, recall, and F-score of the model were 85.4%, 83.47%, and 84.37%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
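Entry 13 finds BIOE tagging (an explicit E- tag on an entity's final token) superior to plain BIO. A small sketch of the conversion, assuming the variant where only multi-token entities receive an E- tag; the paper's exact scheme may differ.

```python
def bio_to_bioe(tags):
    """Convert BIO tags to BIOE: the final token of a multi-token entity gets E-."""
    bioe = list(tags)
    for i, tag in enumerate(tags):
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        if tag.startswith("I-") and not nxt.startswith("I-"):
            bioe[i] = "E-" + tag[2:]
    return bioe

print(bio_to_bioe(["B-PER", "I-PER", "I-PER", "O", "B-CLUB"]))
# -> ['B-PER', 'I-PER', 'E-PER', 'O', 'B-CLUB']
```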
14. Named Entity Recognition of Wheat Diseases and Pests Fusing ALBERT and Rules (融合ALBERT与规则的小麦病虫害命名实体识别).
- Author
-
刘合兵 (LIU Hebing), 张德梦 (ZHANG Demeng), 熊蜀峰 (XIONG Shufeng), 马新明 (MA Xinming), and 席磊 (XI Lei)
- Published
- 2023
- Full Text
- View/download PDF
15. MultiHop attention for knowledge diagnosis of mathematics examination.
- Author
-
He, Xinyu, Zhang, Tongxuan, and Zhang, Guiyun
- Subjects
MATHEMATICS exams, ARTIFICIAL intelligence, DIAGNOSIS
- Abstract
Intelligent educational diagnosis can effectively promote the development of artificial intelligence in education. Knowledge diagnosis in specific domains (e.g., mathematics, physics) plays an important role in intelligent educational diagnosis but typically relies on complex semantic information. Most existing methods produce only a single sentence representation, which has difficulty detecting multiple knowledge points in a text, and resources for knowledge-point diagnosis in specific domains are relatively sparse. In this study, we build a mathematics dataset collected from real mathematical examinations and manually annotated with 18 knowledge points. We also propose the MultiHop Attention mechanism (MHA) model, which focuses on different important information in mathematical questions using multiple attention mechanisms; each attention mechanism obtains different attention weights for different parts of a question, allowing MHA to obtain a comprehensive semantic representation of mathematical questions. Additionally, because the ALBERT model is advanced and efficient, we use it for word embedding in this study. The proposed method synthetically considers multiple keywords related to knowledge points in mathematical questions for knowledge diagnosis research. Experimental results on the proposed mathematical dataset show that MHA achieves marked improvements over existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
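Entry 15's MHA runs several independent attention "hops" over the question so that different knowledge points can dominate different hops. A minimal sketch with learned per-hop query vectors; dimensions, hop count, and the upstream ALBERT embeddings are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiHopAttention(nn.Module):
    """Several attention hops, each producing its own weighting of the tokens."""
    def __init__(self, dim=312, hops=4, num_knowledge_points=18):
        super().__init__()
        self.hop_queries = nn.Parameter(torch.randn(hops, dim))
        self.classifier = nn.Linear(hops * dim, num_knowledge_points)

    def forward(self, token_states):             # (B, T, dim), e.g. ALBERT outputs
        scores = torch.einsum("btd,hd->bht", token_states, self.hop_queries)
        alpha = torch.softmax(scores, dim=-1)     # one distribution per hop
        hops = torch.einsum("bht,btd->bhd", alpha, token_states)
        return self.classifier(hops.flatten(1))  # multi-label logits (sigmoid later)

m = MultiHopAttention()
logits = m(torch.randn(2, 50, 312))               # (2, 18) knowledge-point logits
```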
16. West-östlich diplomacy and connoisseurship in the late Habsburg Empire: Baron Albert Eperjesy and his dispersed collection of Persian art
- Author
-
Iván Szántó
- Subjects
austro-hungarian empire, bozen/bolzano, collecting, amīr khusraw, eperjesy, albert, govardhan, mughal art, persian art, persian calligraphy, qājār dynasty, tehran, tyrol, Arts in general, NX1-820, Anthropology, GN1-890
- Abstract
The purpose of this essay is threefold. Firstly, it introduces the diplomatic and collecting careers of the Austro-Hungarian diplomat Baron Albert Eperjesy (1848–1916), who was the highest representative of his country in numerous European capitals and, between 1895 and 1901, in Tehran. Secondly, an attempt is made to contextualise his collecting habits by drawing attention to the peculiarities of Austro-Hungarian collector-diplomats. Finally, and perhaps most importantly, the Persian element of his collection is discussed within the previously outlined framework: what artworks it included, how and where he obtained them, and what their subsequent fate would be.
- Published
- 2023
- Full Text
- View/download PDF
17. Named Entity Recognition Model Based on Feature Fusion.
- Author
-
Sun, Zhen and Li, Xinfu
- Subjects
-
RANDOM fields, MACHINE learning
- Abstract
Named entity recognition can deeply explore semantic features and enhance the vector representation of text data. This paper proposes a named entity recognition method based on multi-head attention, aimed at the problem of fuzzy lexical boundaries in Chinese named entity recognition. Firstly, Word2vec is used to extract word vectors, HMM is used to extract boundary vectors, and ALBERT is used to extract character vectors; the feedforward-attention mechanism is used to fuse the three vectors, and the fused vector representation is then passed to BiLSTM to extract features. Multi-head attention is subsequently used to mine the potential word information in the text features. Finally, the text label classification results are output after conditional random field screening. Verification on the WeiboNER, MSRA, and CLUENER2020 datasets shows that the proposed algorithm can effectively improve the performance of named entity recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. Microblog Text Emotion Classification Algorithm Based on TCN-BiGRU and Dual Attention.
- Author
-
Qin, Yao, Shi, Yiping, Hao, Xinze, and Liu, Jin
- Subjects
-
EMOTIONS, FEATURE extraction, MACHINE learning, PUBLIC opinion, SENTIMENT analysis, NAIVE Bayes classification
- Abstract
Microblogs are an important platform for mining public opinion, and it is of great value to conduct emotional analysis of microblog texts during the current epidemic. Aiming at the problems that most current emotion classification methods cannot effectively extract deep text features and that traditional word vectors cannot dynamically obtain the semantics of words according to their context, which leads to classification bias, this research puts forward a microblog text emotion classification algorithm based on TCN-BiGRU and dual attention (TCN-BiGRU-DATT). First, the vector representation of the text was obtained using ALBERT. Second, the TCN and BiGRU networks were used to extract the emotional information contained in the text through dual-pathway feature extraction, to efficiently obtain the deep semantic features of the text. Then, the dual attention mechanism was introduced to allocate global weights to the key information in the semantic features, and the emotional features were spliced and fused. Finally, the Softmax classifier was applied for emotion classification. The findings of a comparative experiment on a set of microblog text comments collected throughout the pandemic revealed that the accuracy, recall, and F1 value of the proposed emotion classification method reached 92.33%, 91.78%, and 91.52%, respectively, a significant improvement compared with other models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
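Entry 18's dual-pathway extractor reads the same ALBERT token vectors through a TCN branch (dilated temporal convolutions) and a BiGRU branch, then fuses them. A compressed sketch that replaces the dual attention and classifier with mean pooling plus a linear layer; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class TCNBiGRU(nn.Module):
    """Dual-pathway feature extraction: dilated Conv1d stack + BiGRU."""
    def __init__(self, dim=128, channels=128, num_classes=3):
        super().__init__()
        self.tcn = nn.Sequential(
            nn.Conv1d(dim, channels, kernel_size=3, padding=2, dilation=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=4, dilation=2),
            nn.ReLU(),
        )
        self.bigru = nn.GRU(dim, channels // 2, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * channels, num_classes)

    def forward(self, x):                                  # x: (B, T, dim) ALBERT vectors
        t = self.tcn(x.transpose(1, 2))[:, :, :x.size(1)]  # trim extra padded frames
        g, _ = self.bigru(x)                               # (B, T, channels)
        feats = torch.cat([t.transpose(1, 2), g], dim=-1)  # fuse the two pathways
        return self.out(feats.mean(dim=1))                 # emotion logits

m = TCNBiGRU()
print(m(torch.randn(4, 64, 128)).shape)                    # torch.Size([4, 3])
```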
19. Medical QA Oriented Multi-Task Learning Model for Question Intent Classification and Named Entity Recognition.
- Author
-
Tohti, Turdi, Abdurxit, Mamatjan, and Hamdulla, Askar
- Subjects
-
NATURAL language processing, QUESTION answering systems, NATURAL languages
- Abstract
Intent classification and named entity recognition of medical questions are two key subtasks of the natural language understanding module in a question answering system. Most existing methods treat medical query intent classification and named entity recognition as two separate tasks, ignoring the close relationship between them. In order to optimize the effect of both tasks, a multi-task learning model based on ALBERT-BiLSTM is proposed for intent classification and named entity recognition of Chinese online medical questions. The multi-task learning model makes use of encoder parameter sharing, which enables the model's underlying network to take into account both named entity recognition and intent classification features; the model learns the shared information between the two tasks while maintaining its task-specific characteristics during the decoding phase. The ALBERT pre-trained language model is used to obtain word vectors containing semantic information, and a bidirectional LSTM network is used for training. A comparative experiment with different models was conducted on a Chinese medical questions dataset. Experimental results show that the proposed multi-task learning method outperforms the benchmark method in terms of precision, recall, and F1 value, and that, compared with the single-task model, the generalization ability of the model is improved. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
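Entry 19's multi-task model shares one encoder between the two tasks: a sequence-level intent head and a token-level NER head ride on the same ALBERT+BiLSTM states, and the two task losses are summed. A minimal sketch, with the checkpoint name and label counts as assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class MedicalMultiTask(nn.Module):
    """Shared ALBERT+BiLSTM encoder with intent and NER heads."""
    def __init__(self, num_intents=7, num_tags=9):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("albert-base-v2")  # shared parameters
        hidden = self.encoder.config.hidden_size
        self.bilstm = nn.LSTM(hidden, hidden // 2, bidirectional=True, batch_first=True)
        self.intent_head = nn.Linear(hidden, num_intents)  # task 1: sequence level
        self.ner_head = nn.Linear(hidden, num_tags)        # task 2: token level

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        states, _ = self.bilstm(states)
        return self.intent_head(states[:, 0]), self.ner_head(states)

# Joint training sums both task losses over the same batch, e.g.:
# loss = ce(intent_logits, intent_labels) + ce(ner_logits.transpose(1, 2), tag_labels)
```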
20. Spectral-Spatial Classification of Hyperspectral Images Using BERT-Based Methods With HyperSLIC Segment Embeddings
- Author
-
Ibrahim Onur Sigirci and Gokhan Bilgin
- Subjects
ALBERT, BERT, classification, hyperspectral images, hyperSLIC, segmentation, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Classification performance is strongly affected by the fact that hyperspectral images include many bands, have high dimensionality, and offer few labeled training samples. This challenge is reduced by using rich spatial information and an effective classifier. The classifiers in this study are BERT-based (Bidirectional Encoder Representations from Transformers) models, which have recently been applied in natural language processing. The BERT model and its performance-improved version, the ALBERT (A Lite BERT) model, are utilized as transformer-based models. Because of their structure, these models can also accept spatial information via ‘segment embeddings’. Segmentation algorithms are commonly used in the literature to obtain spatial information, and superpixel methods have shown superior segmentation results due to the utility of working at the superpixel level rather than the conventional pixel level. HyperSLIC, a modified version of the SLIC superpixel method for hyperspectral images, is employed as input to the BERT-based models in this study. In addition, HyperSLIC segmentation results are merged with the DBSCAN algorithm for similar superpixels to increase the size of spatially similar areas; the result is called HyperSLIC-DBSCAN. The effect of segment embedding information on classification accuracy in BERT-based models is studied experimentally. Experimental results show that BERT-based models outperform conventional and deep-learning-based 1D/2D convolutional neural network classifiers when spatial information is used via segment embeddings.
- Published
- 2022
- Full Text
- View/download PDF
21. Chinese Named Entity Recognition in Football Based on ALBERT-BiLSTM Model
- Author
-
Qi An, Bingyu Pan, Zhitong Liu, Shutong Du, and Yixiong Cui
- Subjects
named entity recognition, ALBERT, BiLSTM, deep learning, football, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Football is one of the most popular sports in the world, giving rise to a wide range of research topics related to its off- and on-the-pitch performance. The extraction of football entities from football news helps to construct sports frameworks, integrate sports resources, and capture the dynamics of the sport in a timely manner through visual text mining results, including the connections among football players, football clubs, and football competitions, and it is of great convenience for observing and analyzing the developmental tendencies of football. Therefore, in this paper, we constructed a 1,000,000-word Chinese corpus in the field of football and proposed a BiLSTM-based model for named entity recognition. The ALBERT-BiLSTM deep learning combination model is used for entity extraction from football text data. Based on the BiLSTM model, we introduced ALBERT as a pre-training model to extract character-level features and enhance the generalization ability of word embedding vectors. We then compared the results of two different annotation schemes, BIO and BIOE, and two deep learning models, ALBERT-BiLSTM-CRF and ALBERT-BiLSTM. It was verified that BIOE tagging was superior to BIO and that the ALBERT-BiLSTM model was more suitable for the football dataset. The precision, recall, and F-score of the model were 85.4%, 83.47%, and 84.37%, respectively.
- Published
- 2023
- Full Text
- View/download PDF
22. Localization model of traditional Chinese medicine Zang-fu based on ALBERT and Bi-GRU
- Author
-
De-zheng ZHANG, Xin-xin FAN, Yong-hong XIE, and Yan-zhao JIANG
- Subjects
multi-label text classification, albert, gru, localization of zang-fu, traditional chinese medicine (tcm), Mining engineering. Metallurgy, TN1-997, Environmental engineering, TA170-171
- Abstract
The rapid development of artificial intelligence (AI) has injected new vitality into various industries and provided new ideas for the development of traditional Chinese medicine (TCM). The combination of AI and TCM provides more technical support for TCM auxiliary diagnosis and treatment. In the history of TCM, many methods of syndrome differentiation have been developed, among which the differentiation of Zang-fu organs is one of the most important. The purpose of this paper is to support the localization of Zang-fu in TCM through AI technology. Localization of Zang-fu organs is a method of determining the location of lesions in such organs and is an important stage in the differentiation of Zang-fu organs in TCM. In this paper, a localization model of TCM Zang-fu organs was established with a neural network: given input symptom text, the model outputs the corresponding Zang-fu label for a lesion, supporting Zang-fu syndrome differentiation in TCM-assisted diagnosis and treatment. Localization of Zang-fu organs was abstracted as multi-label text classification in natural language processing. Using TCM medical record data, a Zang-fu localization model based on the pretraining model A Lite BERT (ALBERT) and a bidirectional gated recurrent unit (Bi-GRU) is proposed. Comparison and ablation experiments show that the proposed method is more accurate than multilayer perceptron and decision tree baselines, and that using an ALBERT pretraining model for text representation effectively improves the accuracy of the localization model. In terms of model parameters, the ALBERT pretraining model greatly reduces the number of parameters compared with the BERT model and effectively reduces the model size. Finally, the F1-value of the proposed Zang-fu localization model reaches 0.8013 on the test set, providing support for TCM auxiliary diagnosis and treatment.
- Published
- 2021
- Full Text
- View/download PDF
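Entry 22 casts Zang-fu localization as multi-label classification: one symptom text can indicate several organs, so each label gets an independent sigmoid rather than a single softmax. A minimal sketch with the ALBERT+Bi-GRU encoder stubbed out and an illustrative label count.

```python
import torch
import torch.nn as nn

NUM_ORGANS = 10                        # illustrative label count (heart, liver, ...)
head = nn.Linear(256, NUM_ORGANS)
criterion = nn.BCEWithLogitsLoss()     # one binary decision per organ label

sentence_vec = torch.randn(4, 256)     # stand-in for ALBERT + Bi-GRU features
targets = torch.zeros(4, NUM_ORGANS)
targets[0, [1, 3]] = 1.0               # e.g. a record localized to two organs at once

loss = criterion(head(sentence_vec), targets)
predicted = (torch.sigmoid(head(sentence_vec)) > 0.5).int()  # multi-hot output
```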
23. Multi-Grained Attention Representation With ALBERT for Aspect-Level Sentiment Classification
- Author
-
Yuezhe Chen, Lingyun Kong, Yang Wang, and Dezhi Kong
- Subjects
Aspect-level sentiment classification, ALBERT, natural language processing, deep learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Aspect-level sentiment classification aims to judge the sentiment tendency of each aspect in a sentence with multiple aspects. Previous works mainly employed Long Short-Term Memory (LSTM) and attention mechanisms to fuse information between aspects and sentences, or improved large language models such as BERT to adapt them to aspect-level sentiment classification tasks. The former methods either did not integrate the interactive information of related aspects and sentences, or ignored the feature extraction of sentences. This paper proposes a novel multi-grained attention representation with ALBERT (MGAR-ALBERT). It can learn a representation that contains the relevant information of the sentence and the aspect while integrating it into sentence modeling at multiple granularities, finally yielding a comprehensive sentence representation. In the Masked LM (MLM) task, in order to avoid aspect words being masked in the initial stage of pre-training, noise linear cosine decay is introduced into the n-gram masking. We implemented a series of comparative experiments to verify the effectiveness of the method. The experimental results show that our model achieves excellent results on the Restaurant dataset with a greatly reduced number of parameters, and it is not inferior to other models on the Laptop dataset.
- Published
- 2021
- Full Text
- View/download PDF
24. ALBERTC-CNN Based Aspect Level Sentiment Analysis
- Author
-
Xingxin Ye, Yang Xu, and Mengshi Luo
- Subjects
ALBERT, aspect level, ConvNets, sentiment analysis, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
To solve the problem that most aspect-level sentiment analysis networks cannot extract the global and local information of the context at the same time, this study proposes an aspect-level sentiment analysis model named Combining A Lite Bidirectional Encoder Representation from TransConvs and ConvNets (ALBERTC-CNN). First, the global sentence information and local emotion information in a text are extracted by the improved ALBERTC network, and the input aspect-level text is represented by a word vector. Then, the feature vector is mapped to the number of emotion classes by a linear function and a softmax function. Finally, the aspect-level sentiment analysis results are obtained. The proposed model is tested on two datasets from the SemEval-2014 open task, the laptop and restaurant datasets, and compared with traditional networks. The results show that, compared with the traditional networks, the classification accuracy of the proposed model improves by approximately 4% and 5% on the two sets, whereas the F1 value improves by approximately 4% and 8%. Additionally, compared with the original ALBERT network, accuracy improves by approximately 2% and the F1 value by approximately 1%.
- Published
- 2021
- Full Text
- View/download PDF
25. A Lite Romanian BERT: ALR-BERT.
- Author
-
Nicolae, Dragoş Constantin, Yadav, Rohan Kumar, and Tufiş, Dan
- Subjects
NATURAL language processing, ROMANIAN language, ROMANIANS
- Abstract
Large-scale pre-trained language representation and its promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been huge interest in further increasing the model's size in order to outperform the best previously obtained performances. However, at some point, increasing the model's parameters may lead to reaching its saturation point due to the limited capacity of GPU/TPU. In addition to this, such models are mostly available in English or a shared multilingual structure. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we called "A Lite Romanian BERT (ALR-BERT)". Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to distribute our code and model as an open source together with the downstream task. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. Microblog Text Emotion Classification Algorithm Based on TCN-BiGRU and Dual Attention
- Author
-
Yao Qin, Yiping Shi, Xinze Hao, and Jin Liu
- Subjects
microblog text, sentiment classification, dual attention, fusion features, TCN-BiGRU, ALBERT, Information technology, T58.5-58.64
- Abstract
Microblogs are an important platform for mining public opinion, and it is of great value to conduct emotional analysis of microblog texts during the current epidemic. Aiming at the problems that most current emotion classification methods cannot effectively extract deep text features and that traditional word vectors cannot dynamically obtain the semantics of words according to their context, which leads to classification bias, this research puts forward a microblog text emotion classification algorithm based on TCN-BiGRU and dual attention (TCN-BiGRU-DATT). First, the vector representation of the text was obtained using ALBERT. Second, the TCN and BiGRU networks were used to extract the emotional information contained in the text through dual-pathway feature extraction, to efficiently obtain the deep semantic features of the text. Then, the dual attention mechanism was introduced to allocate global weights to the key information in the semantic features, and the emotional features were spliced and fused. Finally, the Softmax classifier was applied for emotion classification. The findings of a comparative experiment on a set of microblog text comments collected throughout the pandemic revealed that the accuracy, recall, and F1 value of the proposed emotion classification method reached 92.33%, 91.78%, and 91.52%, respectively, a significant improvement compared with other models.
- Published
- 2023
- Full Text
- View/download PDF
27. Named Entity Recognition Model Based on Feature Fusion
- Author
-
Zhen Sun and Xinfu Li
- Subjects
named entity recognition, ALBERT, vector fusion, multiple head attention, Information technology, T58.5-58.64
- Abstract
Named entity recognition can deeply explore semantic features and enhance the vector representation of text data. This paper proposes a named entity recognition method based on multi-head attention, aimed at the problem of fuzzy lexical boundaries in Chinese named entity recognition. Firstly, Word2vec is used to extract word vectors, HMM is used to extract boundary vectors, and ALBERT is used to extract character vectors; the feedforward-attention mechanism is used to fuse the three vectors, and the fused vector representation is then passed to BiLSTM to extract features. Multi-head attention is subsequently used to mine the potential word information in the text features. Finally, the text label classification results are output after conditional random field screening. Verification on the WeiboNER, MSRA, and CLUENER2020 datasets shows that the proposed algorithm can effectively improve the performance of named entity recognition.
- Published
- 2023
- Full Text
- View/download PDF
28. Fine tuning of Language models for automation of Humor Detection.
- Author
-
PALIVELA, HEMANT and CHAUHAN, TAVISHEE
- Subjects
WIT & humor, LANGUAGE & languages
- Abstract
In this paper, we propose a novel approach for humor identification using ALBERT, with automated selection of the best-fit loss function and optimizer. We used two configurations of ALBERT, Albert-base and Albert-large, and compared their results under different hyper-parameters to obtain the best results for the binary classification problem of detecting which texts are humorous and which are not. We also determine the optimizer and loss function that achieve state-of-the-art performance. The proposed system has been evaluated using metrics that include accuracy, precision, recall, F1-score, and the time required. Among the configurations tested, Adafactor on the Albert-base model showed promising results with 99% accuracy. The paper also compares the proposed approach with other language models such as BERT and RoBERTa, observing a steep decline of one-third in the time taken to train the model on 160K sentences. [ABSTRACT FROM AUTHOR]
- Published
- 2021
29. Judging for the world : philosophies of existence, narrative imagination, and the ambiguity of political judgement
- Author
-
Mrovlje, Maša and Hayden, Patrick
- Subjects
320.01, Philosophies of existence, Political judgement, Narrative imagination, JA71.M8, Political science--Philosophy, Judgment, Existentialism--Political aspects, Sartre, Jean-Paul, 1905-1980, Beauvoir, Simone de, 1908-1986, Camus, Albert, 1913-1960, Arendt, Hannah, 1906-1975
- Abstract
The thesis inquires into the theme of political judgement and aims to rethink it from the perspective of twentieth-century philosophies of existence. It seeks to take up the contemporary challenge of political judgement that remains inadequately addressed within recent theorizing: how, given the modern breakdown of metaphysical absolutes, to reinvigorate the human capacity for political judgement as a practical activity able to confront the ambiguous, plural and complex character of our postfoundational world. Against this background, the thesis aspires to reclaim the distinctly historical orientation of twentieth-century existentialism, in particular the work of Jean-Paul Sartre, Simone de Beauvoir, Albert Camus and Hannah Arendt. It draws on their aesthetic sensibility to resuscitate the human judging ability in its worldly ambiguity and point towards an account of political judgement capable of facing up to the challenges of our plural and uncertain political reality. Retrieving their vigilant assumption of the situated, worldly condition of human political existence and the attendant perplexity of judging politically, the aim of the thesis is to suggest how the existentialists' insights can be brought to bear on contemporary problematics of political judgement that seem to elude the grasp of abstract standards and predetermined yardsticks.
- Published
- 2015
- Full Text
- View/download PDF
30. Medical QA Oriented Multi-Task Learning Model for Question Intent Classification and Named Entity Recognition
- Author
-
Turdi Tohti, Mamatjan Abdurxit, and Askar Hamdulla
- Subjects
multi-task learning, named entity recognition, intent classification, ALBERT, deep learning, Information technology, T58.5-58.64
- Abstract
Intent classification and named entity recognition of medical questions are two key subtasks of the natural language understanding module in a question answering system. Most existing methods treat medical query intent classification and named entity recognition as two separate tasks, ignoring the close relationship between them. In order to optimize the effect of both tasks, a multi-task learning model based on ALBERT-BiLSTM is proposed for intent classification and named entity recognition of Chinese online medical questions. The multi-task learning model makes use of encoder parameter sharing, which enables the model's underlying network to take into account both named entity recognition and intent classification features; the model learns the shared information between the two tasks while maintaining its task-specific characteristics during the decoding phase. The ALBERT pre-trained language model is used to obtain word vectors containing semantic information, and a bidirectional LSTM network is used for training. A comparative experiment with different models was conducted on a Chinese medical questions dataset. Experimental results show that the proposed multi-task learning method outperforms the benchmark method in terms of precision, recall, and F1 value, and that, compared with the single-task model, the generalization ability of the model is improved.
- Published
- 2022
- Full Text
- View/download PDF
31. A Lite Romanian BERT: ALR-BERT
- Author
-
Dragoş Constantin Nicolae, Rohan Kumar Yadav, and Dan Tufiş
- Subjects
BERT, transformers, ALBERT, NLP, Romanian, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Large-scale pre-trained language representation and its promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been huge interest in further increasing the model’s size in order to outperform the best previously obtained performances. However, at some point, increasing the model’s parameters may lead to reaching its saturation point due to the limited capacity of GPU/TPU. In addition to this, such models are mostly available in English or a shared multilingual structure. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we called “A Lite Romanian BERT (ALR-BERT)”. Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to distribute our code and model as an open source together with the downstream task.
- Published
- 2022
- Full Text
- View/download PDF
32. Vom katholischen Deutschordensgebiet zum protestantischen Herzogtum Preußen | From the Catholic Domains of the Teutonic Order to the Protestant Duchy of Prussia
- Author
-
Friedrich Johannsen and Jens Riechmann
- Subjects
Teutonic Order, Prussia, Reformation, Albert, Martin Luther, Andreas Osiander, History (General) and history of Europe
- Abstract
The secularisation of the domains of the Teutonic Order in Prussia led to the establishment of the first Lutheran territorial church in the world. This fact is almost forgotten today, and this is evident even in specialised literature on the Reformation. The article outlines the introduction of the Reformation in Prussia, considering it as an example of its smooth and successful entrenchment. In order to show this, the late stage of the rule of the Teutonic Order is defined, showing that fundamental reform was triggered by a multi-layered crisis characteristic of the Order’s domains in Prussia. The article shows that, in coordination with Martin Luther and Philipp Melanchton, and assisted by his bishops, after becoming the first Duke of Prussia in 1525, Albert, the Grand Master of the Teutonic Order, implemented reforms in his domains that resembled the main problems raised by the Reformation in an almost exemplary way. But at the same time, it shows that the introduction of the Reformation in Prussia was not a unidirectional process, for Duke Albert supported Andreas Osiander’s ideas for some time, before he gradually entered the ranks of the confessors of Augsburg.
- Published
- 2017
- Full Text
- View/download PDF
33. Comparing Different Transformer Models’ Performance for Identifying Toxic Language Online
- Author
-
Sundelin, Carl
- Abstract
There is a growing use of the internet and, alongside that, an increase in the use of toxic language towards other people that can be harmful to those it targets. The usefulness of artificial intelligence has exploded in recent years with the development of natural language processing, especially with the use of transformers. One of the first was BERT, which has spawned many variations, including ones that aim to be more lightweight than the original. The goal of this project was to train three different kinds of transformer models, RoBERTa, ALBERT, and DistilBERT, and find out which one was best at identifying toxic language online. The models were trained on a handful of existing datasets with data labelled as abusive, hateful, harassing, and other kinds of toxic language; these datasets were combined to create one dataset used to train and test all of the models. When tested against data collected in the datasets, there was very little difference in the overall performance of the models. The biggest difference was how long it took to train them, with ALBERT taking approximately 2 hours, RoBERTa around 1 hour, and DistilBERT just over half an hour. To understand how well the models worked in a real-world scenario, the models were evaluated by labelling text as toxic or non-toxic on three different subreddits. Here, a larger difference in performance showed up: DistilBERT labelled significantly fewer instances as toxic compared to the other models. A sample of the classified data was manually annotated, and it showed that the RoBERTa and DistilBERT models still performed similarly to each other. A second evaluation was done on the data from Reddit with a threshold of 80% certainty required for a classification to be considered toxic. This led to an average of 28% of instances being classified as toxic by RoBERTa, whereas ALBERT and DistilBERT classified an average of 14% and 11% as toxic, respectively. When the results f
- Published
- 2023
34. Enhancing Context-Based Question-Answering using Attention Mechanism
- Author
-
Amil M Shaji and Rony Tom
- Subjects
SQUAD Transformers, Attention Mechanism, ALBERT, Chatbot, Pytorch
- Abstract
This research utilizes an attention mechanism to enhance the efficacy of context-driven question answering (QA) in chatbot technology. To produce more accurate responses to user queries, we used the ALBERT-base-v2-squad_v2 model, fine-tuned on the Stanford Question Answering Dataset (SQuAD) v2. To offer pertinent replies, the model needs to concentrate on key words or phrases in the given context, which is assisted by the attention mechanism. Our tests revealed that the suggested strategy performed better than conventional QA models without an attention mechanism and had a higher accuracy rate. These findings can be extended to numerous chatbot applications and show how attention mechanisms improve context-based question answering in chatbots.
- Published
- 2023
- Full Text
- View/download PDF
35. ALBERT over Match-LSTM Network for Intelligent Questions Classification in Chinese
- Author
-
Xiaomin Wang, Haoriqin Wang, Guocheng Zhao, Zhichao Liu, and Huarui Wu
- Subjects
ALBERT, match-LSTM, natural language processing, classification, NQuAD, Agriculture
- Abstract
This paper introduces a series of experiments with an ALBERT over match-LSTM network on top of pre-trained word vectors, for accurate classification of intelligent question answering and thus the guarantee of precise information service. To improve the performance of data classification, a short text classification method based on an ALBERT and match-LSTM model was proposed to overcome the limitations of the classification process, such as small vocabularies, sparse features, large amounts of data, heavy noise, and poor normalization. In the model, the Jieba word segmentation tool and an agricultural dictionary were selected for text segmentation; the GloVe algorithm was then adopted to expand the text characteristics and weight the word vectors according to the key vectors of the text; a bi-directional gated recurrent unit was applied to catch the contextual feature information; and multiple convolutional neural networks were finally established to gain local multidimensional characteristics of the text. Batch normalization, Dropout, Global Average Pooling, and Global Max Pooling were utilized to solve the overfitting problem. The results showed that the model could classify questions accurately, with a precision of 96.8%. Compared with other classification models, such as the multi-SVM model and the CNN model, ALBERT+match-LSTM had obvious advantages in classification performance for intelligent Agri-tech information service.
- Published
- 2021
- Full Text
- View/download PDF
36. Demystifying the Negative: René Girard's Critique of the "Humanization of Nothingness".
- Author
-
Wilmes, Andreas
- Subjects
-
MODERN philosophy, MODERN history, HUMAN beings
- Abstract
This paper will address René Girard's critique of the "humanization of nothingness" in modern Western philosophy. I will first explain how the "desire for death" is related to a phenomenon that Girard refers to as "obstacle addiction." Second, I will point out how mankind's desire for death and illusory will to self-divinization gradually tend to converge within the history of modern Western humanism. In particular, I will show how this convergence between selfdestruction and self-divinization gradually takes shape through the evolution of the concept of "the negative" from Hegel to Kojève, Sartre and Camus. Finally, we shall come to see that in Girard's view "the negative" has tended to become an ever-preoccupying and unacknowledged symptom of mankind's addiction to "model/obstacles" of desire. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
37. Text Mining of Stocktwits Data for Predicting Stock Prices
- Author
-
Mukul Jaggi, Priyanka Mandal, Shreya Narang, Usman Naseem, and Matloob Khushi
- Subjects
BERT ,FinBERT ,ALBERT ,NLP ,StockTwits ,FinALBERT ,Technology ,Applied mathematics. Quantitative methods ,T57-57.97 - Abstract
Stock price prediction can be made more efficient by considering price fluctuations and understanding people's sentiments. Few models understand financial jargon or have labelled datasets concerning stock price change. To overcome this challenge, we introduced FinALBERT, an ALBERT-based model trained to handle financial-domain text classification tasks by labelling StockTwits text data based on stock price change. We collected StockTwits data spanning more than ten years for 25 different companies, including the five major FAANG companies (Facebook, Amazon, Apple, Netflix, Google). These datasets were labelled with three labelling techniques based on stock price changes. Our proposed model FinALBERT is fine-tuned with these labels to achieve optimal results. We experimented with the labelled dataset by training it on traditional machine learning, BERT, and FinBERT models, which helped us understand how these labels behaved with different model architectures. Our labelling method's competitive advantage is that it can help analyse historical data effectively, and the mathematical function can be easily customised to predict stock movement.
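The abstract does not give the exact labelling function, so the snippet below is only a plausible sketch of price-change-based labelling; the threshold value and column names are assumptions, not the paper's actual choices.

```python
# Hypothetical price-change labelling for StockTwits messages.
# The threshold and the data schema are illustrative assumptions.
import pandas as pd

def label_by_price_change(change_pct: float, threshold: float = 1.0) -> str:
    """Map a next-day percentage price change to a sentiment-style label."""
    if change_pct > threshold:
        return "positive"
    if change_pct < -threshold:
        return "negative"
    return "neutral"

df = pd.DataFrame({"text": ["$AAPL to the moon", "$NFLX looks weak"],
                   "next_day_change_pct": [2.3, -1.8]})
df["label"] = df["next_day_change_pct"].apply(label_by_price_change)
print(df[["text", "label"]])
```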
- Published
- 2021
- Full Text
- View/download PDF
38. Geometric current response in Chern systems and topological delocalization in Floquet class AIII systems
- Author
-
Brown, Albert
- Subjects
Condensed matter physics ,Albert ,Brown ,Chern ,Floquet ,Roy ,Topology - Abstract
Topological phases are phases of matter that are characterized by discrete quantities known as topological invariants. This thesis explores two such phases: the static Chern insulator phase (symmetry class A) in two dimensions and the dynamical Floquet chiral phase (symmetry class AIII) in one dimension. A Chern insulator is a gapped single-particle system on a lattice with a non-zero first Chern number for some of its energy-momentum bands. We consider subjecting a Chern insulator to a non-uniform external electric field. The response is band geometric, which means it is robust to deformations of the energy bands that do not cause band touchings. We find a connection between this response and previous work on band-geometric quantities. A Floquet insulator is a unitarily time-evolved system defined by a time-periodic Hamiltonian. We first describe an existing model of a 1D chain with a 2D onsite Hilbert space and chiral symmetry. Then we introduce a disordered version of the model and examine how its topological properties are robust to the disorder. We find a power-law scaling exponent of $\nu = 2$ for the localization-delocalization transition of the eigenstates of the unitary operator as the drive evolves towards the midway point of its evolution.
- Published
- 2019
39. RoBERTa: language modelling in building Indonesian question-answering systems
- Author
-
Wiwin Suwarningsih, Raka Aditya Pramata, Fadhil Yusuf Rahadika, and Mochamad Havid Albar Purnomo
- Subjects
language modelling ,Indonesian QAS ,ALBERT ,ELECTRA ,Electrical and Electronic Engineering ,RoBERTa - Abstract
This research aimed to evaluate the performance of the A Lite BERT (ALBERT), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), and Robustly Optimized BERT Pretraining Approach (RoBERTa) models to support the development of an Indonesian question-answering system. The evaluation used Indonesian, Malay, and Esperanto. Esperanto was used as a point of comparison for Indonesian because it is an international language that belongs to no person or country, which makes it neutral. Compared to other foreign languages, the structure and construction of Esperanto are relatively simple. The dataset used was the result of crawling Wikipedia for Indonesian and the Open Super-large Crawled ALMAnaCH coRpus (OSCAR) for Esperanto. The token dictionary used in the tests contained approximately 30,000 subword tokens, built with both the SentencePiece and byte-level byte-pair encoding (ByteLevelBPE) methods. The tests were carried out with learning rates of 1e-5 and 5e-5 for both languages, following the bidirectional encoder representations from transformers (BERT) paper. As the final results of this study show, the ALBERT and RoBERTa models on Esperanto yielded loss values that were not much different; nevertheless, the RoBERTa model proved the better choice for implementing an Indonesian question-answering system.
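As a rough sketch of how such a ~30,000-entry subword vocabulary can be built, the snippet below trains a byte-level BPE tokenizer with the Hugging Face tokenizers library; the corpus file name, special tokens, and minimum frequency are assumptions.

```python
# Sketch: training a ByteLevelBPE vocabulary of ~30,000 subword tokens.
# "id_wiki.txt" is a hypothetical plain-text dump of the crawled corpus.
import os
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["id_wiki.txt"],
    vocab_size=30_000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
os.makedirs("tokenizer_out", exist_ok=True)
tokenizer.save_model("tokenizer_out")  # writes vocab.json and merges.txt
print(tokenizer.encode("pemodelan bahasa Indonesia").tokens)
```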
- Published
- 2022
40. Der ungesicherte Goldschatz.
- Author
-
Seyringer, Christian, Loibner, Klaus, Raumauf, Hannes, and Stoll, Judith
- Abstract
Copyright of e & i Elektrotechnik und Informationstechnik is the property of Springer Nature and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2018
- Full Text
- View/download PDF
41. TextCNN-based ensemble learning model for Japanese Text Multi-classification.
- Author
-
Chen, Hua, Zhang, Zepeng, Huang, Shiting, Hu, Jiayu, Ni, Wenlong, and Liu, Jianming
- Subjects
- *
CLASSIFICATION - Abstract
In this paper, we aim to improve Japanese text classification using a TextCNN-based ensemble learning model. Specifically, we first construct three different sub-classifiers, combining ALBERT, RoBERTa, and DistilBERT with TextCNN, respectively; we then explore the effectiveness of the ensemble learning model in leveraging complementary information from the different sub-classifiers for better text classification. We also conduct a series of experiments with a dataset collected from Japanese Wikipedia pages, divided into 31 categories. The experimental results show that the proposed approach achieves good performance. The accuracy, precision, recall, and F1 scores reach 0.881, 0.884, 0.880, and 0.881, respectively, which shows that the TextCNN-based ensemble learning model can be used effectively for Japanese text multi-classification. • Three TextCNN-based sub-classifiers for Japanese text classification are designed. • A bagging ensemble learning model is proposed to combine the three different sub-classifiers for multi-label Japanese text classification. • A Japanese dataset for classification is constructed by crawling Japanese Wikipedia pages. • The TextCNN-based model with bagging ensemble learning performs well for text classification. [ABSTRACT FROM AUTHOR]
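The abstract does not spell out the aggregation rule, so the snippet below is a minimal sketch of hard-voting ensembling over three sub-classifier predictions; the per-model predict functions are hypothetical stand-ins, not the paper's implementation.

```python
# Sketch: hard-voting (bagging-style) ensemble over three sub-classifiers.
from collections import Counter

# Hypothetical stand-ins for the three fine-tuned sub-classifiers; each would
# wrap an ALBERT/RoBERTa/DistilBERT encoder feeding a TextCNN head.
def predict_albert(text: str) -> str: return "science"
def predict_roberta(text: str) -> str: return "science"
def predict_distilbert(text: str) -> str: return "history"

def ensemble_predict(text: str) -> str:
    votes = [p(text) for p in (predict_albert, predict_roberta, predict_distilbert)]
    return Counter(votes).most_common(1)[0][0]  # majority label

print(ensemble_predict("機械学習に関する記事です。"))  # -> "science"
```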
- Published
- 2023
- Full Text
- View/download PDF
42. Extracting information about arms deals from news articles
- Author
-
Hernqvist, Fredrik and Hernqvist, Fredrik
- Abstract
The Stockholm International Peace Research Institute (SIPRI) maintains the most comprehensive publicly available database on international arms deals. Updating this database requires humans to sift through large amounts of news articles, only some of which contain information relevant to the database. To save time, it would be useful to automate part of this process. In this thesis project we apply ALBERT, a state-of-the-art pre-trained language model for Natural Language Processing (NLP), to the task of determining whether a text contains information about arms transfers and extracting that information. In order to train and evaluate the model we also introduce a new dataset of 600 news articles, in which information about arms deals is annotated with labels such as Weapon, Buyer, Seller, etc. We achieve an F1-score of 0.81 on the task of determining if an arms deal is present in a text, and an F1-score of 0.77 on determining if a given part of a text has a specific arms deal-related attribute. This is probably not enough to entirely automate SIPRI's process, but it demonstrates that the approach is feasible. While this thesis focuses specifically on arms deals, the methods used can be generalized to extracting other kinds of information.
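A plausible sketch of the attribute-extraction step is token classification with an ALBERT encoder, as below; the BIO label set (built from the record's Weapon/Buyer/Seller attributes) and the base checkpoint are assumptions, since the record does not give the thesis's exact setup.

```python
# Sketch: ALBERT-based token classification for arms-deal attributes.
# The label list and checkpoint are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-WEAPON", "I-WEAPON", "B-BUYER", "I-BUYER", "B-SELLER", "I-SELLER"]
tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
model = AutoModelForTokenClassification.from_pretrained(
    "albert-base-v2", num_labels=len(labels)
)  # would be fine-tuned on the annotated 600-article dataset

text = "Country A agreed to buy 12 fighter jets from Company B."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = logits.argmax(dim=-1)[0].tolist()
print([labels[i] for i in pred_ids])  # untrained head yields arbitrary tags
```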
- Published
- 2022
43. Federated Learning for Natural Language Processing using Transformers
- Author
-
Kjellberg, Gustav and Kjellberg, Gustav
- Abstract
The use of Machine Learning (ML) in business has increased significantly over the past years. Creating high-quality and robust models requires a lot of data, which is at times infeasible to obtain. As more people become concerned about their data being misused, data privacy is increasingly strengthened. In 2018, the General Data Protection Regulation (GDPR) was announced within the EU. Models that train on either sensitive or personal data need to obtain that data in accordance with regulatory rules such as the GDPR. Another data-related issue is that enterprises who wish to collaborate on model building face problems when doing so requires them to share their private corporate data [36, 38]. In this thesis we will investigate how one might overcome the issue of directly accessing private data when training ML models by employing Federated Learning (FL) [38]. The concept of FL is to allow several silos, i.e. separate parties, to train models with the same objective using their local data, and then to create a central model from the learned model parameters. The objective of the central model is to obtain the information learned by the separate models, without ever accessing the raw data itself. This is achieved by averaging the separate models' weights into the central model. FL thus facilitates opportunities to train a model on large amounts of data from several sources, without the need to access the data itself. If one can create a model with this methodology that is not significantly worse than a model trained on the raw data, then positive effects such as strengthened data privacy, cross-enterprise collaboration and more could be attainable. In this work we have used a financial data set consisting of 25242 equity research reports, provided by Skandinaviska Enskilda Banken (SEB). Each report has a recommendation label, either Buy, Sell or Hold, making this a multi-class classification problem. To evaluate the feasibility of FL we fine-tune the p
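As a minimal sketch of the weight-averaging step described above (federated averaging), the snippet below averages per-silo PyTorch model weights into a central model; the equal-weight average is an assumption, as practical FedAvg typically weights each silo by its local dataset size.

```python
# Sketch: averaging silo model weights into a central model (FedAvg-style).
# Equal weighting is assumed; production FedAvg weights by local sample count.
import copy
import torch
import torch.nn as nn

def federated_average(silo_models: list[nn.Module]) -> nn.Module:
    central = copy.deepcopy(silo_models[0])
    avg_state = central.state_dict()
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in silo_models])
        avg_state[key] = stacked.mean(dim=0)
    central.load_state_dict(avg_state)
    return central

silos = [nn.Linear(8, 3) for _ in range(4)]  # stand-ins for locally trained models
central_model = federated_average(silos)
```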
- Published
- 2022
44. Research on Multi-label Text Classification Method Based on tALBERT-CNN
- Author
-
Liu, Wenfu, Pang, Jianmin, Li, Nan, Zhou, Xin, and Yue, Feng
- Published
- 2021
- Full Text
- View/download PDF
45. What do managers of IPO firms tell investors : evidence from IPO online roadshow in chinese stock market
- Author
-
Sun, Naili and Du, Saijuan
- Subjects
First-day return ,Online roadshows ,Liquidity ,ALBERT ,Question-answers relevance - Abstract
Retail investors, who are the majority of traders in the Chinese stock market, cannot participate in offline roadshows of IPO firms. Online roadshows are therefore an important channel for them to learn about the situation of IPO firms. This study takes the firms newly listed on the Small and Medium Enterprise Board of the Shenzhen Stock Exchange from 2010 to 2018 as its research subject, measures the relevance between investors' questions and management's answers in the firms' online roadshows, and investigates the relationship between question-answer relevance and the first-day performance of IPO firms. According to the results, a high level of question-answer relevance leads to a high first-day return and high liquidity. Furthermore, additional analysis shows that IPO firms with high question-answer relevance have stable long-term stock prices, whereas IPO firms with low question-answer relevance tend to see their stock prices decline. Online roadshows can mitigate, to some extent, the information asymmetry between investors and IPO firms, and play an important role in assessing firm value.
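The record does not say how question-answer relevance was scored beyond listing ALBERT as a keyword, so the snippet below is only a generic sketch of embedding-based relevance scoring; the checkpoint, mean pooling, and cosine similarity are all illustrative assumptions.

```python
# Generic sketch of a question-answer relevance score via ALBERT embeddings.
# Mean pooling and cosine similarity are assumptions, not the study's method.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
model = AutoModel.from_pretrained("albert-base-v2")

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden)
    return hidden.mean(dim=1).squeeze(0)            # mean-pooled sentence vector

q = "What is the expected revenue growth next year?"
a = "We project revenue to grow around ten percent year over year."
relevance = torch.cosine_similarity(embed(q), embed(a), dim=0)
print(float(relevance))
```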
- Published
- 2021
46. ELECTROLBERT: Combining Replaced Token Detection and Sentence Order Prediction
- Author
-
Martin Reczko
- Subjects
BioASQ ,Biomedical Question Answering ,ELECTRA ,ALBERT - Abstract
Abstract of the conference paper in the Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th to 8th, 2022. Full paper at http://ceur-ws.org/Vol-3180/paper-24.pdf
- Published
- 2022
- Full Text
- View/download PDF
47. Special and exceptional mock-Lie algebras.
- Author
-
Zusmanovich, Pasha
- Subjects
- *
LIE algebras , *COMMUTATIVE algebra , *JACOBI identity , *TOPOLOGICAL algebras , *MATHEMATICAL analysis - Abstract
We observe several facts and make conjectures about commutative algebras satisfying the Jacobi identity. The central question is which of those algebras admit a faithful representation (i.e., in Lie parlance, satisfy the Ado theorem, or, in Jordan parlance, are special). [ABSTRACT FROM AUTHOR]
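For readers unfamiliar with the term, a mock-Lie algebra as described in this abstract is a commutative algebra whose product also satisfies the Jacobi identity; the rendering below is a standard formulation of that definition, not notation taken from the paper.

```latex
% A mock-Lie algebra: a commutative algebra A satisfying the Jacobi identity.
\[
xy = yx, \qquad x(yz) + y(zx) + z(xy) = 0 \quad \text{for all } x, y, z \in A.
\]
```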
- Published
- 2017
- Full Text
- View/download PDF
48. Multi-Grained Attention Representation With ALBERT for Aspect-Level Sentiment Classification
- Author
-
Dezhi Kong, Yuezhe Chen, Lingyun Kong, and Yang Wang
- Subjects
Context model ,General Computer Science ,Computer science ,Sentiment analysis ,Feature extraction ,General Engineering ,deep learning ,ALBERT ,Data modeling ,TK1-9971 ,Aspect-level sentiment classification ,Task analysis ,General Materials Science ,Language model ,Artificial intelligence ,Electrical engineering. Electronics. Nuclear engineering ,Representation (mathematics) ,Sentence ,Natural language processing - Abstract
Aspect-level sentiment classification aims to judge the sentiment tendency of each aspect in a sentence that contains multiple aspects. Previous works mainly employed Long Short-Term Memory (LSTM) and attention mechanisms to fuse information between aspects and sentences, or adapted large language models such as BERT to aspect-level sentiment classification tasks. The former methods either did not integrate the interactive information of related aspects and sentences, or ignored the feature extraction of sentences. This paper proposes a novel multi-grained attention representation with ALBERT (MGAR-ALBERT). It can learn a representation that contains the relevant information of the sentence and the aspect, while integrating it into the process of sentence modeling at multiple granularities, finally yielding a comprehensive sentence representation. In the Masked LM (MLM) task, to avoid aspect words being masked in the initial stage of pre-training, noisy linear cosine decay is introduced into the n-gram masking. We ran a series of comparative experiments to verify the effectiveness of the method. The experimental results show that our model achieves excellent results on the Restaurant dataset while greatly reducing the number of parameters, and it is not inferior to other models on the Laptop dataset.
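As a rough sketch of the kind of aspect-to-sentence attention the abstract describes (not the paper's exact architecture), the snippet below computes attention weights of a pooled aspect query over sentence token representations; the dimensions and random inputs are assumptions.

```python
# Illustrative aspect-over-sentence attention; dimensions are arbitrary
# assumptions, not the MGAR-ALBERT configuration.
import torch
import torch.nn.functional as F

hidden = 768
sentence_repr = torch.randn(1, 12, hidden)  # (batch, tokens, hidden), e.g. encoder outputs
aspect_repr = torch.randn(1, hidden)        # pooled aspect-word representation

scores = torch.einsum("bth,bh->bt", sentence_repr, aspect_repr) / hidden ** 0.5
weights = F.softmax(scores, dim=-1)                      # attention over tokens
aspect_aware = torch.einsum("bt,bth->bh", weights, sentence_repr)
print(aspect_aware.shape)  # (1, 768): aspect-conditioned sentence vector
```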
- Published
- 2021
49. ALBERTC-CNN Based Aspect Level Sentiment Analysis
- Author
-
Yang Xu, Xingxin Ye, and Mengshi Luo
- Subjects
General Computer Science ,Computer science ,Feature vector ,Emotion classification ,Feature extraction ,Sentiment analysis ,aspect level ,General Engineering ,Context (language use) ,Pattern recognition ,ALBERT ,TK1-9971 ,Softmax function ,General Materials Science ,Artificial intelligence ,Electrical engineering. Electronics. Nuclear engineering ,Electrical and Electronic Engineering ,Encoder ,ConvNets - Abstract
To solve the problem that most aspect-level sentiment analysis networks cannot extract the global and local information of the context at the same time, this study proposes an aspect-level sentiment analysis model named Combining A Lite Bidirectional Encoder Representation from TransConvs and ConvNets (ALBERTC-CNN). First, the global sentence information and local emotion information in a text are extracted by the improved ALBERTC network, and the input aspect-level text is represented as a word vector. Then, the feature vector is mapped to the number of emotion classes by a linear function and a softmax function. Finally, the aspect-level sentiment analysis results are obtained. The proposed model is tested on two datasets from the SemEval-2014 open task, the laptop and restaurant datasets, and compared with traditional networks. The results show that, compared with the traditional networks, the classification accuracy of the proposed model improves by approximately 4% and 5% on the two sets, whereas the F1 value improves by approximately 4% and 8%. Additionally, compared with the original ALBERT network, the accuracy improves by approximately 2%, and the F1 value improves by approximately 1%.
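The linear-plus-softmax classification step described above can be sketched as follows; the hidden size and the three-class label set are assumptions, since the abstract does not state the exact dimensions.

```python
# Sketch: mapping a pooled feature vector to emotion-class probabilities
# via a linear layer and softmax, as the abstract describes.
import torch
import torch.nn as nn

hidden_size, num_classes = 768, 3  # assumed dims: encoder hidden, {neg, neu, pos}
head = nn.Linear(hidden_size, num_classes)

features = torch.randn(1, hidden_size)         # stand-in for ALBERTC-CNN features
probs = torch.softmax(head(features), dim=-1)  # class probabilities
print(probs, probs.argmax(dim=-1))
```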
- Published
- 2021
50. Interventionism in Statistical Mechanics
- Author
-
Stephen Leeds
- Subjects
interventionism ,Albert ,arrow of time ,Science ,Astrophysics ,QB460-466 ,Physics ,QC1-999 - Abstract
I defend the idea that the fact that no system is entirely isolated (“Interventionism”) can be used to explain the successful use of the microcanonical distribution in statistical mechanics. The argument turns on claims about what is needed for an adequate explanation of this fact: I argue in particular that various competing explanations do not meet reasonable conditions of adequacy, and that the most striking lacuna in Interventionism—its failure to explain the “arrow of time”—is no real defect.
- Published
- 2012
- Full Text
- View/download PDF