A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model
- Author
- Um, Taehum and Kim, Namhyoung
- Subjects
- LANGUAGE models, ARTIFICIAL neural networks, TRANSFORMER models, LATENT variables, STOCHASTIC models
- Abstract
- As an extension of the transformer architecture, the BERT model has introduced a new paradigm for natural language processing, achieving impressive results in various downstream tasks. However, high-performance BERT-based models—such as ELECTRA, ALBERT, and RoBERTa—suffer from limitations such as poor continuous learning capability and insufficient understanding of domain-specific documents. To address these issues, we propose the use of an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Furthermore, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, thereby enabling BERT-based models to sufficiently comprehend the contexts of specific fields. We conducted experiments on three datasets—Movie Review Dataset (MRD), 20Newsgroups, and YELP—to evaluate our model's performance. Compared to the vanilla model, the proposed model achieved an accuracy improvement of 1–2% for the ALBERT model in multi-class classification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
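The abstract describes fusing a BERT-style encoder with a neural topic model through an attention mechanism. The paper's exact architecture is not given in this record, so the following is only a minimal sketch of the general idea: project a document-topic distribution (as a neural topic model might produce) into the encoder's hidden space and let the token representations attend to it via cross-attention before classification. All names, dimensions, and layer choices below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TopicAttentionClassifier(nn.Module):
    """Hypothetical sketch: fuse transformer hidden states with a neural-topic
    representation through cross-attention, then classify.
    Dimensions and layers are assumptions for illustration only."""

    def __init__(self, hidden_dim=768, num_topics=50, num_classes=5, num_heads=8):
        super().__init__()
        # Project the document-topic proportions (e.g. from a VAE-style neural
        # topic model) into the transformer's hidden dimension.
        self.topic_proj = nn.Linear(num_topics, hidden_dim)
        # Cross-attention: token states (queries) attend to the topic embedding.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_states, topic_dist):
        # token_states: (batch, seq_len, hidden_dim) from a BERT-style encoder
        # topic_dist:   (batch, num_topics) document-topic proportions
        topic_emb = self.topic_proj(topic_dist).unsqueeze(1)   # (batch, 1, hidden_dim)
        fused, _ = self.cross_attn(token_states, topic_emb, topic_emb)
        pooled = fused.mean(dim=1)                             # simple mean pooling
        return self.classifier(pooled)                         # (batch, num_classes)

# Usage with random tensors standing in for encoder outputs and topic proportions.
model = TopicAttentionClassifier()
tokens = torch.randn(2, 128, 768)
topics = torch.softmax(torch.randn(2, 50), dim=-1)
logits = model(tokens, topics)
```

In practice the token states would come from a pretrained encoder such as ALBERT or ELECTRA, and the topic proportions from a jointly or separately trained neural topic model; the cross-attention step is one plausible way to realize the "neural topic attention" named in the title.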