5,419 results for "WordNet"
Search Results
2. Integrating YOLO and WordNet for automated image object summarization.
- Author
-
Saqib, Sheikh Muhammad, Aftab, Aamir, Mazhar, Tehseen, Iqbal, Muhammad, Shahazad, Tariq, Almogren, Ahmad, and Hamam, Habib
- Abstract
The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the things are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This new technique can have a big impact on computer vision and natural language processing. It can make understanding complicated images, filled with lots of things, much simpler. To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. The precision was 85%, the recall was 72%, and the F1-score was 74%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
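The pipeline this abstract describes, detect objects and then attach dictionary meanings, can be sketched in a few lines. The mini-lexicon and mock detections below are hypothetical stand-ins for WordNet glosses and YOLO output, not the authors' actual data:

```python
# Toy sketch of the gloss-lookup step: detected object labels are mapped to
# dictionary definitions and turned into short per-object summaries.
# MINI_WORDNET is a hypothetical stand-in for WordNet synset glosses, and
# `detections` stands in for YOLO output (label, confidence).
MINI_WORDNET = {
    "dog": "a domesticated carnivorous mammal kept as a pet or for work",
    "bicycle": "a two-wheeled vehicle propelled by pedals",
    "car": "a road vehicle powered by an engine, typically for four people",
}

def summarize_objects(detections):
    """Build one summary line per detected object label."""
    lines = []
    for label, confidence in detections:
        gloss = MINI_WORDNET.get(label, "no gloss available")
        lines.append(f"{label} ({confidence:.0%}): {gloss}")
    return lines

detections = [("dog", 0.91), ("bicycle", 0.78)]  # mock YOLO output
for line in summarize_objects(detections):
    print(line)
```

A real implementation would query WordNet glosses for each detected label (e.g., via the NLTK WordNet interface) instead of the toy dictionary.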
3. An Efficient Ant Colony Optimization Optimized Deep Belief Network Based Text Summarization Using Diverse Beam Search Computation for Social Media Content Extraction.
- Author
-
Vinitha M. and Vasundra S.
- Subjects
ANT algorithms, TEXT summarization, SOCIAL media, MACHINE learning, RECOMMENDER systems, K-means clustering - Abstract
Social media is a platform for sharing hashtags, news, and posts within a community. The ever-growing volume of text and information makes content difficult to read and understand. At present, every business must analyze social media data, as social media platforms are where people worldwide discuss and share social commentary. Automated text summarization is crucial for condensing lengthy content into concise summaries using learning-based methods. The unstructured data in social media contain significant phrases, supported by their sources, for analyzing sentiment and extracting the importance of the content. Previously, data were analyzed by matching related content for similarity. The main drawback is that the data must be examined to rephrase or summarize the essential extracted terms; improper content extraction lowers precision and increases the false-content rate. To tackle these problems, this paper presents a Machine Learning (ML) intelligence algorithm with Diverse Beam Search-Based Maximum Mutual Information (DBSMMI) and an Ant Colony Optimization (ACO)-optimized Deep Belief Network Based Text Summarization (DBNTS). Initially, the COVID-19 Twitter dataset is preprocessed to remove noise, and a pheromone value set is created based on a k-means semantic similarity algorithm. Our work analyzes and clusters the data according to their theme (area). Data analysis, the central step, is performed using WordNet keyword matching and semantic matching of words. Similar words are then clustered using a semantic-similarity-based k-means clustering algorithm. DBSMMI is used to identify maximally supported identical content phrases for term and sentence extraction, and the maximally supported cluster group is optimized for its theme using ant colony optimization with the DBNTS algorithm. The algorithm's efficiency is tested against existing classifier algorithms. An ACO semantic recommender system is also implemented to recommend relevant news to Twitter users. The proposed simulation attains 92.35% accuracy and 90.29% precision, efficiently improving classification accuracy and precision compared to other methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
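The semantic-similarity-based k-means step this abstract describes can be sketched with cosine similarity over toy vectors. The 2-d vectors, the themes, and k=2 are illustrative assumptions; the paper clusters tweets by theme with WordNet-based semantic matching, which this simplifies to cosine over embeddings:

```python
import math

# Sketch of semantic-similarity-based k-means: words are grouped by the
# cosine similarity of their vectors. The 2-d toy vectors below are
# hypothetical, not the paper's COVID-19 Twitter data.
WORDS = {"vaccine": (0.9, 0.1), "virus": (0.8, 0.2), "mask": (0.85, 0.3),
         "election": (0.1, 0.9), "senate": (0.2, 0.8)}

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def kmeans(words, k, iters=10):
    centroids = [words[w] for w in list(words)[:k]]  # simple init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, vec in words.items():
            # assign each word to the most similar centroid
            best = max(range(k), key=lambda i: cos(vec, centroids[i]))
            clusters[best].append(w)
        # recompute centroids as the mean vector of each cluster
        centroids = [
            tuple(sum(words[w][d] for w in cl) / len(cl) for d in range(2))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

print(kmeans(WORDS, 2))  # health-themed vs. politics-themed words
```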
4. A hybrid model to improve IC-related metrics of semantic similarity between words.
- Author
-
Xiao, Jia
- Subjects
WILCOXON signed-rank test, STATISTICAL correlation, RANK correlation (Statistics), CONFIDENCE intervals, SAMPLE size (Statistics) - Abstract
This paper proposes a hybrid model, named IC+SP, to improve Information Content (IC)-related metrics of semantic similarity between words. It is based on the hypothesis that IC and the shortest path are two relatively independent pieces of semantic evidence with approximately equal influence on the semantic similarity metric. The paradigm of IC+SP is to linearly combine the IC-related metric with the shortest path. A transformation from the semantic similarity of concepts to that of words is also presented by maximizing every component of IC+SP. Thirteen improved IC-related metrics based on IC+SP are formed and implemented on the experimental platform HESML (Lastra-Díaz, Inf Syst 66:97–118, 2017). To evaluate IC+SP, Pearson's and Spearman's correlation coefficients on well-accepted benchmarks for the improved metrics are compared with those for the original ones. I introduce the Wilcoxon Signed-Rank Test, which requires no normal-distribution hypothesis, whereas the T-Test requires this hypothesis on small samples. Both tests are conducted on the differences between the correlation coefficients of the improved and original metrics. It is expected that the improved IC-related metrics significantly outperform their original counterparts, and the experimental results, including comparisons of the mean and maximum correlation coefficients as well as the p-values and confidence intervals of both tests, confirm this expectation in the vast majority of cases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
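The core of IC+SP, a linear combination of an IC-based similarity with a shortest-path similarity, can be sketched on a toy taxonomy. The tree, the concept counts, the Lin-style IC metric, and the weight alpha=0.5 are illustrative assumptions, not the paper's actual data or parameterization:

```python
import math

# Minimal sketch of the IC+SP paradigm: alpha * sim_IC + (1 - alpha) * sim_SP.
# The toy taxonomy and frequencies below are hypothetical.
PARENT = {"dog": "mammal", "cat": "mammal", "salmon": "fish",
          "mammal": "animal", "fish": "animal", "animal": None}
COUNT = {"dog": 30, "cat": 25, "salmon": 10,
         "mammal": 60, "fish": 15, "animal": 80}  # assumed corpus counts

def ancestors(c):
    """Path from a concept up to the root, starting at the concept itself."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path

def lcs(a, b):
    """Lowest common subsumer: first ancestor of a that also subsumes b."""
    anc_b = set(ancestors(b))
    return next(c for c in ancestors(a) if c in anc_b)

def ic(c):
    """Information content from the (toy) corpus frequencies."""
    return -math.log(COUNT[c] / COUNT["animal"])

def sim_ic(a, b):
    """Lin-style IC similarity."""
    denom = ic(a) + ic(b)
    return 2 * ic(lcs(a, b)) / denom if denom else 1.0

def sim_sp(a, b):
    """Shortest-path similarity, 1 / (1 + edge count)."""
    s = lcs(a, b)
    return 1.0 / (1.0 + ancestors(a).index(s) + ancestors(b).index(s))

def sim_icsp(a, b, alpha=0.5):
    """IC+SP: linear combination of the two semantic evidences."""
    return alpha * sim_ic(a, b) + (1 - alpha) * sim_sp(a, b)

print(round(sim_icsp("dog", "cat"), 3))     # closer concepts score higher
print(round(sim_icsp("dog", "salmon"), 3))
```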
5. Naïve Bayes classifier for Kashmiri word sense disambiguation.
- Author
-
Mir, Tawseef Ahmad and Lawaye, Aadil Ahmad
- Abstract
Many applications of Natural Language Processing (NLP), such as machine translation, document clustering, and information retrieval, make use of Word Sense Disambiguation (WSD). WSD automatically predicts the sense of an ambiguous word that exactly fits the given context. While it may seem easy for humans to interpret the meaning of natural language, machines require the processing of huge amounts of data for similar tasks. In this paper, we propose an automatic WSD system for the Kashmiri language based on the Naive Bayes classifier. To the best of our knowledge, this work is the first attempt towards developing a WSD system for Kashmiri. Bag-of-Words (BoW) and Part-of-Speech (PoS) features are used to develop the system. Experiments are carried out on a manually crafted sense-tagged dataset of 60 ambiguous Kashmiri words, selected based on their frequency in the collected raw corpus. The senses used to annotate these ambiguous words are extracted from Kashmiri WordNet. The performance of the proposed system is measured using accuracy, precision, recall, and F-1 measure. The proposed WSD model performed best (accuracy = 89.92, precision = 0.84, recall = 0.89, F-1 measure = 0.86) when PoS and BoW features were used together. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
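Naive Bayes WSD with bag-of-words features, as this abstract describes, can be sketched from scratch. The tiny sense-tagged sample for the English word "bank" is a hypothetical stand-in for the paper's manually crafted Kashmiri dataset:

```python
import math
from collections import Counter, defaultdict

# Toy Naive Bayes word sense disambiguation with bag-of-words features and
# Laplace smoothing. TRAIN is illustrative data, not the paper's corpus.
TRAIN = [
    ("river bank water fish", "bank/GEO"),
    ("muddy bank of the stream", "bank/GEO"),
    ("bank loan interest money", "bank/FIN"),
    ("deposit money at the bank", "bank/FIN"),
]

def train(examples):
    prior = Counter()                   # sense counts
    word_counts = defaultdict(Counter)  # per-sense word counts
    vocab = set()
    for text, sense in examples:
        prior[sense] += 1
        for w in text.split():
            word_counts[sense][w] += 1
            vocab.add(w)
    return prior, word_counts, vocab

def predict(context, prior, word_counts, vocab):
    """Pick the sense maximizing log P(sense) + sum log P(word | sense)."""
    total = sum(prior.values())
    best, best_lp = None, -math.inf
    for sense in prior:
        lp = math.log(prior[sense] / total)
        n = sum(word_counts[sense].values())
        for w in context.split():
            # Laplace smoothing over the shared vocabulary
            lp += math.log((word_counts[sense][w] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

model = train(TRAIN)
print(predict("fish near the river", *model))
```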
6. System Fusion Based on WordNet Word Sense Disambiguation.
- Author
-
Duan, Mengtao and Luan, Tingyan
- Subjects
EXPERIMENTAL groups, CONTROL groups - Abstract
In the realm of natural language processing (NLP), Word Sense Disambiguation (WSD) is a crucial task, and WSD systems are used in many NLP applications. Systems built on WordNet (e.g., Lesk) have made encouraging progress in word sense disambiguation. Yet the performance of WordNet-based WSD systems can be limited when disambiguating polysemous words. The purpose of this research was to investigate the discrepancies between a systematic fusion of WordNet WSD systems and a single best-performing system. In the experimental condition, the fusion approach was used to disambiguate; in the control condition, the single best system was used. The accuracy, recall, and disambiguation time of the two groups were compared on the same test dataset. The results show that the accuracy and recall of the experimental group are better than those of the control group: fusing the decisions of multiple systems strengthens the accuracy and comprehensiveness of the overall system. In terms of disambiguation time, the experimental group still achieved an acceptable disambiguation rate. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
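The fusion idea this abstract describes can be sketched as a majority vote over the outputs of several WSD systems. The system names and sense labels below are hypothetical:

```python
from collections import Counter

# Minimal sketch of WSD system fusion: several systems each propose a sense
# for the same ambiguous word, and the fused decision is a majority vote.
def fuse(votes):
    """Return the sense proposed by the most systems."""
    return Counter(votes).most_common(1)[0][0]

votes = {
    "lesk": "bass#fish",             # gloss-overlap system
    "path_similarity": "bass#fish",  # taxonomy-distance system
    "first_sense": "bass#music",     # most-frequent-sense baseline
}
print(fuse(votes.values()))  # two of three systems agree on bass#fish
```

A confidence-weighted vote (summing each system's score per candidate sense) is a natural extension of the same sketch.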
7. Bridging Natural Language Processing and psycholinguistics: computationally grounded semantic similarity datasets for Basque and Spanish
- Author
-
Josu Goikoetxea, Itziar San Martin, and Miren Arantzeta
- Subjects
WordNet, text, psycholinguistic features, word similarity, embeddings, nouns, Language and Literature - Abstract
Introduction: Semantic relations are crucial in various cognitive processes, highlighting the need to understand concept interactions and how such relations are represented in the brain. Psycholinguistics research requires computationally grounded datasets that include word similarity measures controlled for the variables that play a significant role in lexical processing. This work presents a dataset of noun pairs in Basque and European Spanish based on two well-known Natural Language Processing resources: text corpora and knowledge bases. Methods: The dataset creation consisted of three steps: (1) computing four key psycholinguistic features for each noun (concreteness, frequency, and semantic and phonological neighborhood density); (2) pairing nouns across these four variables; (3) assigning each noun pair three types of word similarity measurements, computed from text, WordNet, and hybrid embeddings. Results: A dataset of noun pairs in Basque and Spanish with three types of word similarity measurements, along with four lexical features for each noun in the pair, namely word frequency, concreteness, and semantic and phonological neighbors. The selection of the nouns in each pair was controlled by these variables, which play a significant role in lexical processing. The three similarity measurements differ in their embedding computation: semantic relatedness from text-based embeddings, pure similarity from WordNet-based embeddings, and both categorical and associative relations from hybrid embeddings. Discussion: The present work fills an existing gap for Basque and Spanish, namely the lack of datasets that include both word similarity and detailed lexical properties, and provides a more useful resource for psycholinguistics research in these languages.
- Published
- 2024
- Full Text
- View/download PDF
8. BioBERT for Multiple Knowledge-Based Question Expansion and Biomedical Extractive Question Answering
- Author
-
Gabsi, Imen, Kammoun, Hager, Wederni, Asma, Amous, Ikram, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nguyen, Ngoc Thanh, editor, Franczyk, Bogdan, editor, Ludwig, André, editor, Núñez, Manuel, editor, Treur, Jan, editor, Vossen, Gottfried, editor, and Kozierkiewicz, Adrianna, editor
- Published
- 2024
- Full Text
- View/download PDF
9. Word2Vec-GloVe-BERT Embeddings for Query Expansion
- Author
-
Gabsi, Imen, Kammoun, Hager, Mtar, Rawed, Amous, Ikram, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Bajaj, Anu, editor, Hanne, Thomas, editor, and Hong, Tzung-Pei, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Knowledge Graph-Based Evaluation Metric for Conversational AI Systems: A Step Towards Quantifying Semantic Textual Similarity
- Author
-
Gaur, Rajat, Dwivedi, Ankit, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Dhar, Suparna, editor, Goswami, Sanjay, editor, Dinesh Kumar, U., editor, Bose, Indranil, editor, Dubey, Rameshwar, editor, and Mazumdar, Chandan, editor
- Published
- 2024
- Full Text
- View/download PDF
11. Multi-objective Black-Box Test Case Prioritization Based on Wordnet Distances
- Author
-
van Dinten, Imara, Zaidman, Andy, Panichella, Annibale, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Arcaini, Paolo, editor, Yue, Tao, editor, and Fredericks, Erik M., editor
- Published
- 2024
- Full Text
- View/download PDF
12. Advances Toward Word-Sense Disambiguation
- Author
-
Mir, Tawseef Ahmad, Lawaye, Aadil Ahmad, Ahmed, Ghayas, Rana, Parveen, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Hassanien, Aboul Ella, editor, Castillo, Oscar, editor, Anand, Sameer, editor, and Jaiswal, Ajay, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Data Augmentation Techniques in Automatic Translation of Vietnamese Sign Language for the Deaf
- Author
-
Do, Duy Cop, Ho, Thi Tuyen, Nguyen, Thi Bich Diep, Magjarević, Ratko, Series Editor, Ładyżyński, Piotr, Associate Editor, Ibrahim, Fatimah, Associate Editor, Lackovic, Igor, Associate Editor, Rock, Emilio Sacristan, Associate Editor, Vo, Van Toi, editor, Nguyen, Thi-Hiep, editor, Vong, Binh Long, editor, Le, Ngoc Bich, editor, and Nguyen, Thanh Qua, editor
- Published
- 2024
- Full Text
- View/download PDF
14. A hybrid model to improve IC-related metrics of semantic similarity between words
- Author
-
Jia Xiao
- Subjects
Semantic similarity, Ontology, Information content, WordNet, Wilcoxon Signed-Rank Test, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64 - Abstract
This paper proposes a hybrid model, named IC+SP, to improve Information Content (IC)-related metrics of semantic similarity between words. It is based on the hypothesis that IC and the shortest path are two relatively independent pieces of semantic evidence with approximately equal influence on the semantic similarity metric. The paradigm of IC+SP is to linearly combine the IC-related metric with the shortest path. A transformation from the semantic similarity of concepts to that of words is also presented by maximizing every component of IC+SP. Thirteen improved IC-related metrics based on IC+SP are formed and implemented on the experimental platform HESML (Lastra-Díaz, Inf Syst 66:97–118, 2017). To evaluate IC+SP, Pearson’s and Spearman’s correlation coefficients on well-accepted benchmarks for the improved metrics are compared with those for the original ones. I introduce the Wilcoxon Signed-Rank Test, which requires no normal-distribution hypothesis, whereas the T-Test requires this hypothesis on small samples. Both tests are conducted on the differences between the correlation coefficients of the improved and original metrics. It is expected that the improved IC-related metrics significantly outperform their original counterparts, and the experimental results, including comparisons of the mean and maximum correlation coefficients as well as the p-values and confidence intervals of both tests, confirm this expectation in the vast majority of cases.
- Published
- 2024
- Full Text
- View/download PDF
15. A hybrid semantic recommender system based on an improved clustering.
- Author
-
Bahrani, Payam, Minaei-Bidgoli, Behrouz, Parvin, Hamid, Mirzarezaee, Mitra, and Keshavarz, Ahmad
- Subjects
RECOMMENDER systems, K-nearest neighbor classification, SCALABILITY - Abstract
A recommender system is a model that automatically recommends meaningful items (such as clips/films/goods) to clients/users according to their (previous) interests. There are two traditional general recommender system models, i.e., the Collaborative Filtering Recommender System (ColFRS) and the Content-based Filtering Recommender System (ConFRS). There is also a hybrid of the two, called a Hybrid Recommender System (HRS), which usually outperforms the simple traditional systems. Scalability, cold start, and sparsity are among the main problems any recommender system should solve. Memory-based (modeless) recommender systems achieve good accuracy but lack admissible scalability, while model-based recommender systems are scalable but lack admissible accuracy. In this paper, we propose a hybrid model based on an automatically improved ontology to address the scalability, cold-start, and sparsity problems. Our proposed HRS also uses an innovative clustering approach as an augmented section. When there are enough ratings, it uses collaborative filtering to predict the missing ratings; when there are not, it uses content-based filtering. In the content-based filtering section, ontology concepts are used to improve the accuracy of rating prediction. If the target client is severely sparse, even the ratings predicted by the content-based filtering section cannot be trusted; in that case, our proposed HRS uses additive clustering to improve the prediction of the missing ratings. It is experimentally shown that our model outperforms many newly developed recommender systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
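The switching logic this abstract describes, collaborative filtering when the target user has enough ratings and a content-based (ontology-concept) estimate otherwise, can be sketched on toy data. The ratings, the item concepts, and the threshold of 3 ratings are illustrative assumptions, not the paper's actual design:

```python
# Toy sketch of the hybrid CF / content-based switch. All data below are
# hypothetical; the paper's ontology and clustering steps are simplified
# to a shared-concept overlap check.
RATINGS = {
    "u1": {"i1": 5, "i2": 4, "i3": 4},
    "u2": {"i2": 5},                     # sparse user
    "u3": {"i1": 2, "i3": 5, "i4": 4},
}
ITEM_CONCEPTS = {"i1": {"action"}, "i2": {"action", "scifi"}, "i4": {"scifi"}}

def collaborative(user, item):
    """Toy CF: mean rating of the item across the other users."""
    vals = [r[item] for u, r in RATINGS.items() if u != user and item in r]
    return sum(vals) / len(vals) if vals else None

def content_based(user, item):
    """Mean of the user's ratings on items sharing an ontology concept."""
    target = ITEM_CONCEPTS.get(item, set())
    shared = [score for i, score in RATINGS.get(user, {}).items()
              if ITEM_CONCEPTS.get(i, set()) & target]
    return sum(shared) / len(shared) if shared else None

def predict(user, item, min_ratings=3):
    if len(RATINGS.get(user, {})) >= min_ratings:
        return collaborative(user, item)
    return content_based(user, item)

print(predict("u1", "i4"))  # enough ratings: collaborative path
print(predict("u2", "i4"))  # sparse user: content-based path
```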
16. Machine Hands on Flaws to Machine: The Surprising Sources of Biases in Machine Learning Models.
- Author
-
Steinfeld, Kyle
- Subjects
GENERATIVE adversarial networks, REINFORCEMENT learning, COLLEGE teachers, ARTIFICIAL intelligence - Abstract
After musing on the history and varying media of the concept of 'gone viral', Associate Professor of Architecture at the University of California, Berkeley, Kyle Steinfeld further investigates computational design through the lens of cultural practices. Even the seemingly most contemporary and innovative technological ideas and gizmos can be traced back to a series of legacy notions that remain silently present in new advances. The article discusses such 'hinge' moments and searches for them in AI. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Query reformulation system based on WordNet and word vectors clusters.
- Author
-
Jumde, Amol and Keskar, Ravindra
- Subjects
CHATBOTS, MATHEMATICAL reformulation, SEARCH engines, CHATGPT, VECTOR spaces, PERSONAL assistants, INTERNET users - Abstract
With the tremendous evolution of the internet, it has become a household utility. Internet users use search engines or personal assistants to request information. Search results depend greatly on the entered keywords, and casual users may enter vague queries for lack of knowledge of domain-specific words. We propose a query reformulation system that determines the context of the query, decides which keywords to replace, and outputs a better, modified query. We propose strategies for keyword replacement and metrics for checking query improvement. We have found that if keywords are projected into a vector space using word embedding techniques, a correct keyword replacement makes the cluster of the new keyword set more cohesive. This assumption forms the basis of our proposed work. To prove the effectiveness of the proposed system, we applied it to ad-hoc retrieval tasks over two benchmark corpora, viz. TREC-CDS 2014 and OHSUMED. We indexed these corpora with the Whoosh search engine and evaluated the system on the queries provided with each corpus. Experimental results show that the proposed techniques achieve a 9-11% improvement in precision and recall. Using Google's popularity index, we also show that the reformulated queries are not only more accurate but also more popular. The proposed system also applies to conversational AI chatbots like ChatGPT, where users must rephrase their queries to obtain better results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
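The cohesion check this abstract describes, keep a keyword replacement if it makes the keyword vectors cluster more tightly, can be sketched with mean pairwise cosine similarity. The 3-d "embeddings" below are hypothetical toy vectors, not real word2vec output:

```python
import math

# Toy sketch of the cluster-cohesion test for query reformulation: a
# replacement is judged an improvement when the keyword set's mean
# pairwise cosine similarity increases. VECS is illustrative data.
VECS = {
    "heart":   (0.9, 0.1, 0.0),
    "cardiac": (0.85, 0.15, 0.05),
    "attack":  (0.8, 0.2, 0.1),
    "party":   (0.0, 0.9, 0.3),
}

def cos(u, v):
    def norm(w):
        return math.sqrt(sum(x * x for x in w))
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm(u) * norm(v))

def cohesion(words):
    """Mean pairwise cosine similarity of the keywords' vectors."""
    pairs = [(a, b) for i, a in enumerate(words) for b in words[i + 1:]]
    return sum(cos(VECS[a], VECS[b]) for a, b in pairs) / len(pairs)

original = ["heart", "attack", "party"]    # vague query
reformed = ["heart", "attack", "cardiac"]  # "party" replaced
print(cohesion(reformed) > cohesion(original))  # True: more cohesive
```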
18. Internetowy słownik języka zawodowego polskich dziennikarzy prasowych. Koncepcja tezaurusa dziedzinowego typu wordnet - preliminaria [An online dictionary of the professional language of Polish press journalists: the concept of a wordnet-type domain thesaurus - preliminaries].
- Author
-
Jarosz, Beata
- Published
- 2024
- Full Text
- View/download PDF
19. Towards a Multimodal WordNet for Language Learning in Bulgarian
- Author
-
Petya Osenova and Kiril Simov
- Subjects
Wordnet, Sub-lexicons, Language Learning, Images, Bulgarian, Information technology, T58.5-58.64 - Abstract
In this paper we present modifications to and extensions of a Wordnet for Bulgarian designed to make it more appropriate for applications in language learning. To support education, we need to ensure the appropriate selection of sets of synonyms (synsets) from BTB-Wordnet depending on the education level of the learners, as well as various types of exercises based on integrating the learning topic with the semantic information in the Wordnet. For this purpose, our focus is mainly on combining the lexemes (lemmas), with their meanings and examples, with specially designed pictures illustrating those meanings within the synsets. We report on our preliminary results.
- Published
- 2024
- Full Text
- View/download PDF
20. Enhancing Aspect-based Sentiment Analysis with ParsBERT in Persian Language
- Author
-
Farid Ariai, Maryam Tayefeh Mahmoudi, and Ali Moeini
- Subjects
opinion mining, sentiment analysis, aspect-based sentiment analysis, lexical semantic disambiguation, wordnet, Information technology, T58.5-58.64, Computer software, QA76.75-76.765 - Abstract
In the era of pervasive internet use and the dominance of social networks, researchers face significant challenges in Persian text mining, including the scarcity of adequate datasets in Persian and the inefficiency of existing language models. This paper specifically tackles these challenges, aiming to amplify the efficiency of language models tailored to the Persian language. Focusing on enhancing the effectiveness of sentiment analysis, our approach employs an aspect-based methodology utilizing the ParsBERT model, augmented with a relevant lexicon. The study centers on sentiment analysis of user opinions extracted from the Persian website 'Digikala.' The experimental results not only highlight the proposed method's superior semantic capabilities but also showcase its efficiency gains with an accuracy of 88.2% and an F1 score of 61.7. The importance of enhancing language models in this context lies in their pivotal role in extracting nuanced sentiments from user-generated content, ultimately advancing the field of sentiment analysis in Persian text mining by increasing efficiency and accuracy.
- Published
- 2024
- Full Text
- View/download PDF
21. Developing Lexico-Semantic Relations of Saraiki Nouns: A Corpus-Based Study
- Author
-
Musarat Nazeer, Musarrat Azher, Azhar Pervaiz, and Iqra Yasmeen
- Subjects
corpus-based study, saraiki nouns, lexico-semantic relations, wordnet, nlp, English literature, PR1-9680, Language. Linguistic theory. Comparative grammar, P101-410 - Abstract
Saraiki, being the fourth most widely spoken language in Pakistan and being used in some parts of India and Afghanistan, is of significant geographical, historical, and cultural importance. However, it remains neglected in terms of proper documentation and identification of its unique linguistic features. The current study is centered on identifying the lexico-semantic categories of Saraiki nouns and then developing their hierarchical relationships (Miller et al., 1993). This quantitative research is designed to contribute to the process of developing Saraiki WordNet and is related to Natural Language Processing (NLP). A corpus of 3 million words was developed on the basis of data collected from different genres of the Saraiki language, including newspapers, academic essays, literary texts, and religious books. Both expansion and merge approaches were used to analyze the data. A wordlist of 1500 most occurring nouns was extracted from the corpus using Antconc 3.4.4.0, followed by manual tagging in Microsoft Excel 2010. Resultantly, 39 most occurring nouns from the wordlist were used to develop 173 related synsets, and lexico-semantic relationships among these nouns were identified with the help of 30 hierarchies (Miller et al., 1993). This study is limited to areas like Bahawalpur, Multan, and Muzaffarabad. It would be a milestone for Saraiki language learners, SWN development, Saraiki lexical resources, online SL dictionaries, and a guide for researchers.
- Published
- 2024
22. Conversion of the Spanish WordNet databases into a Prolog-readable format
- Author
-
Julián-Iranzo, Pascual, Rigau, Germán, Sáenz-Pérez, Fernando, and Velasco-Crespo, Pablo
- Published
- 2024
- Full Text
- View/download PDF
23. Are We Talking about the Same Thing? Modeling Semantic Similarity between Common and Specialized Lexica in WordNet.
- Author
-
Barbero, Chiara and Amaro, Raquel
- Subjects
CODE switching (Linguistics), DATABASES, LEXICON, EXPERTISE, SHARING - Abstract
Specialized languages can activate different sets of semantic features compared to general language, or express concepts through different words according to the domain. The specialized lexicon, i.e., lexical units that denote more specific concepts and knowledge emerging from specific domains, co-exists with the common lexicon, i.e., the set of lexical units that denote concepts and knowledge shared by average speakers, regardless of their specific training or expertise. Communication between specialists and non-specialists can show a big gap between the language(s), and therefore the lexical units, used by the two groups. Quite often, however, semantic and conceptual overlap between specialized and common lexical units occurs, and in many cases the specialized and common units refer to close concepts or even point to the same reality. Considering the modeling of meaning in functional lexical resources, this paper puts forth a solution that links common and specialized lexica within the WordNet model framework. We propose a new relation expressing semantic proximity between common and specialized units and define the conditions for its establishment. Besides contributing to the observation and understanding of the process of knowledge specialization and its reflex in the lexicon, the proposed relation allows for the integration of specialized and non-specialized lexicons into a single database, directly improving communication in specialist/non-specialist contexts, such as teaching-learning situations or health professional-patient interactions, among many others, where code-switching is frequent and necessary. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
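The cross-lexicon link this abstract proposes can be sketched as a wordnet-style relation store extended with one extra relation type. All synset ids and the relation name "semantic_proximity" are illustrative, not the paper's actual data:

```python
# Toy sketch of a wordnet-style relation store extended with a
# "semantic_proximity" relation connecting a common-language synset to its
# specialized-domain counterpart. All entries are hypothetical.
relations = {
    ("heart_attack.n.01", "hypernym"): "attack.n.02",           # common lexicon
    ("myocardial_infarction.n.01", "domain"): "medicine.n.01",  # specialized
}

# the new relation linking the common and specialized units
relations[("heart_attack.n.01", "semantic_proximity")] = "myocardial_infarction.n.01"

def related(synset, rel):
    """Follow one relation edge, or return None if absent."""
    return relations.get((synset, rel))

print(related("heart_attack.n.01", "semantic_proximity"))
```

In a single integrated database, following this edge lets an application map a lay term to its domain equivalent (and back) without leaving the WordNet model.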
24. Plagiarism Detection System by Semantic and Syntactic Analysis Based on Latent Dirichlet Allocation Algorithm.
- Author
-
Nahar, Khalid M. O., Alshtaiwi, Ma’moun, Alikhashashneh, Enas, Shatnawi, Nahlah, Al-Shannaq, Moy’awiah A., Abual-Rub, Mohammed, and BaniIsmail, Basel
- Subjects
NATURAL language processing, PLAGIARISM, LATENT variables, ALGORITHMS - Abstract
The process of plagiarism detection is one of the challenges in revealing the originality of a document, especially in the fields of science and research. Natural language processing methods can recognize and determine the level of similarity between different documents. In this paper, we tackle the task of extrinsic plagiarism detection using semantic and syntactic approaches. The objective is to identify segments of a document that show strong similarity to a group of reference documents on the same topic. We present a hybrid approach that implements semantic and syntactic features based on Latent Dirichlet Allocation (LDA) and the Wu & Palmer algorithm. The proposed approach has been evaluated on the public PAN13 dataset with a total accuracy of 85%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
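The Wu & Palmer measure used in the hybrid approach above is sim = 2·depth(LCS) / (depth(a) + depth(b)), where LCS is the lowest common subsumer. It can be sketched on a toy taxonomy (illustrative, not the paper's resource):

```python
# Wu & Palmer similarity on a toy is-a taxonomy. The tree below is a
# hypothetical stand-in for WordNet's noun hierarchy.
PARENT = {"dog": "canine", "wolf": "canine", "canine": "mammal",
          "cat": "mammal", "mammal": "entity", "entity": None}

def depth(c):
    """Node depth, counting the root 'entity' as depth 1."""
    d = 0
    while c is not None:
        d += 1
        c = PARENT[c]
    return d

def ancestors(c):
    out = []
    while c is not None:
        out.append(c)
        c = PARENT[c]
    return out

def wu_palmer(a, b):
    # lowest common subsumer: first ancestor of a that also subsumes b
    anc_b = set(ancestors(b))
    lcs = next(c for c in ancestors(a) if c in anc_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))

print(wu_palmer("dog", "wolf"))  # 2*3/(4+4) = 0.75
print(wu_palmer("dog", "cat"))   # 2*2/(4+3) ≈ 0.57
```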
25. A hybrid semantic recommender system enriched with an imputation method.
- Author
-
Bahrani, Payam, Minaei-Bidgoli, Behrouz, Parvin, Hamid, Mirzarezaee, Mitra, and Keshavarz, Ahmad
- Abstract
Recommender systems are widely used in many applications. They can be viewed as predictor systems that suggest accurate and highly preferred items to consumers or clients, and can be considered information filtering systems. They face important challenges such as cold start (the absence of enough data about a new item to make accurate recommendations), scalability, and sparsity. Memory-based recommender systems have high accuracy but lack scalability, while model-based systems are scalable but less accurate. Current recommender systems use hybrid methods, usually combining content-based filtering with collaborative filtering, to deal with the most important shortcomings of the traditional filtering approaches. In this paper, a hybrid recommender system is presented to meet the stated challenges, increase system performance, and provide more accurate recommendations. The system uses both content-based filtering and collaborative filtering. In addition, using an automatically collected wordnet, we create an ontology that is used in the content-based filtering section of our proposed approach. Furthermore, the framework applies the KNN (k-nearest neighbors) algorithm and clustering to improve its functionality. The proposed system is evaluated on a real benchmark. The experiments show that the proposed method performs better than the current superior related methods, and that our recommender system has desirable scalability compared with state-of-the-art recommender systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. A new ontology-based similarity approach for measuring caching coverages provided by mediation systems.
- Author
-
Ajarroud, Ouafa, Zellou, Ahmed, and Idri, Ali
- Subjects
CACHE memory, ONTOLOGIES (Information retrieval), SEMANTIC computing, ONOMASIOLOGY, SYNTAX (Grammar) - Abstract
Most mediation systems use a caching policy to overcome their performance challenges. One of the most widely adopted strategies is known as semantic caching. Semantic caches are so called because they store the descriptions of all submitted queries. Although their name suggests they are based on semantics, this is not really the case: they actually compare the syntax of the cached queries to the syntax of the new query to retrieve responses from the cache. This can lead to significant delays, especially if many queries are stored in the cache. In this work, we propose a new semantic approach based on ontologies to compute the semantic similarity between two given queries, and we also provide a new algorithm to filter out all regions of the cache that do not semantically cover a user query. In this way, the use of the cache is both optimal and fast, despite the large number of regions in the cache; only the most beneficial regions are processed to retrieve data from the cache. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
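The ontology-based query-coverage idea in entry 26 can be illustrated with a tiny hand-coded hypernym taxonomy. The taxonomy below is hypothetical data standing in for a real ontology such as WordNet, and `path_similarity` follows the same 1/(1 + path length) shape as WordNet's path measure; this is a sketch of the idea, not the paper's actual algorithm.

```python
# Tiny hand-coded hypernym taxonomy standing in for a real ontology
# (hypothetical data; a production system would load WordNet or a domain ontology).
HYPERNYM = {
    "car": "vehicle", "truck": "vehicle", "vehicle": "artifact",
    "price": "value", "cost": "value", "value": "attribute",
    "artifact": "entity", "attribute": "entity",
}

def path_to_root(term):
    """Chain of hypernyms from a term up to the taxonomy root."""
    path = [term]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def path_similarity(a, b):
    """1 / (1 + length of the path through the lowest common ancestor),
    the same shape as WordNet's path measure."""
    pa, pb = path_to_root(a), path_to_root(b)
    for i, node in enumerate(pa):
        if node in pb:
            return 1.0 / (1 + i + pb.index(node))
    return 0.0

def query_coverage(q1, q2):
    """Average best-match similarity of q1's terms against q2:
    how well a cached query q2 semantically covers a new query q1."""
    return sum(max(path_similarity(t1, t2) for t2 in q2)
               for t1 in q1) / len(q1)
```

A cached query `["truck", "cost"]` then partially covers a new query `["car", "price"]` even though the two share no surface terms, which is the point of moving from syntactic to semantic cache matching.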
27. Detection of Hate Speech in Assamese Text
- Author
-
Baruah, Nomi, Gogoi, Arjun, Neog, Mandira, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Kumar, Sandeep, editor, Hiranwal, Saroj, editor, Purohit, S.D., editor, and Prasad, Mukesh, editor
- Published
- 2023
- Full Text
- View/download PDF
28. Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
- Author
-
Stöckl, Andreas, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Yang, Xin-She, editor, Sherratt, R. Simon, editor, Dey, Nilanjan, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
29. Automatic Text Summarization Using Word Embeddings
- Author
-
Antony, Sophiya, Pankaj, Dhanya S., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Sharma, Neha, editor, Goje, Amol, editor, Chakrabarti, Amlan, editor, and Bruckstein, Alfred M., editor
- Published
- 2023
- Full Text
- View/download PDF
30. GujAGra: An Acyclic Graph to Unify Semantic Knowledge, Antonyms, and Gujarati–English Translation of Input Text
- Author
-
Patel, Margi, Joshi, Brijendra Kumar, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Singh, Pradeep, editor, Singh, Deepak, editor, Tiwari, Vivek, editor, and Misra, Sanjay, editor
- Published
- 2023
- Full Text
- View/download PDF
31. Semantic Similarity based Automated Answer Script Evaluation System using Machine Learning Pipeline and Natural Language Processing
- Author
-
Shabariram, C. P., Priya Ponnuswamy, P., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Smys, S., editor, Tavares, João Manuel R. S., editor, and Shi, Fuqian, editor
- Published
- 2023
- Full Text
- View/download PDF
32. Using Classifier Ensembles to Predict Election Results Using Twitter Data Sentiment Analysis
- Author
-
Sharma, Pinki, Kumar, Santosh, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Mahapatra, Rajendra Prasad, editor, Peddoju, Sateesh K., editor, Roy, Sudip, editor, and Parwekar, Pritee, editor
- Published
- 2023
- Full Text
- View/download PDF
33. Word Sense Disambiguation from English to Indic Language: Approaches and Opportunities
- Author
-
Mishra, Binod Kumar, Jain, Suresh, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Patel, Kanubhai K., editor, Santosh, K. C., editor, and Patel, Atul, editor
- Published
- 2023
- Full Text
- View/download PDF
34. HanaNLG: A Flexible Hybrid Approach for Natural Language Generation
- Author
-
Barros, Cristina, Lloret, Elena, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, and Gelbukh, Alexander, editor
- Published
- 2023
- Full Text
- View/download PDF
35. Exploiting Metonymy from Available Knowledge Resources
- Author
-
Gonzalez-Dios, Itziar, Álvez, Javier, Rigau, German, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, and Gelbukh, Alexander, editor
- Published
- 2023
- Full Text
- View/download PDF
36. Russian Emotional Concepts in the Multilingual Technological Environment
- Author
-
Serikov, Andrei E., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Bylieva, Daria, editor, and Nordmann, Alfred, editor
- Published
- 2023
- Full Text
- View/download PDF
37. Retrospective Inspection for Research in Natural Language Processing of Hindi Language Using Fuzzy Logic
- Author
-
Vij, Sonakshi, Virmani, Deepali, Jain, Amita, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Tuba, Milan, editor, Akashe, Shyam, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
38. A Survey of Different Approaches for Word Sense Disambiguation
- Author
-
Ransing, Rasika, Gulati, Archana, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Fong, Simon, editor, Dey, Nilanjan, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
39. Semantic-Based Feature Extraction and Feature Selection in Digital Library User Behaviour Dataset
- Author
-
Fernandez, F. Mary Harin, Punithavathi, I. S. Hephzi, Ramana, T. Venkata, Ramana, K. Venkata, Xhafa, Fatos, Series Editor, Smys, S., editor, Lafata, Pavel, editor, Palanisamy, Ram, editor, and Kamel, Khaled A., editor
- Published
- 2023
- Full Text
- View/download PDF
40. A Novel Hysynset-Based Topic Modeling Approach for Marathi Language
- Author
-
Bafna, Prafulla B., Saini, Jatinderkumar R., Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Choudrie, Jyoti, editor, Mahalle, Parikshit, editor, Perumal, Thinagaran, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
41. A Graph-Based Extractive Assamese Text Summarization
- Author
-
Baruah, Nomi, Sarma, Shikhar Kr., Borkotokey, Surajit, Borah, Randeep, Phukan, Rakhee D., Gogoi, Arjun, Xhafa, Fatos, Series Editor, Asari, Vijayan K., editor, Singh, Vijendra, editor, Rajasekaran, Rajkumar, editor, and Patel, R. B., editor
- Published
- 2023
- Full Text
- View/download PDF
42. User's learning capability aware E-content recommendation system for enhanced learning experience
- Author
-
P. Vijayakumar and G. Jagatheeshkumar
- Subjects
E-Learning ,Recommendation system ,Learning experience ,Classification ,WordNet ,Electric apparatus and materials. Electric circuits. Electric networks ,TK452-454.4 - Abstract
E-learning is inevitable during these pandemic days, and most learners find it comfortable to learn online. However, the main challenge is to locate the appropriate data in line with the learner's requirements. Considering the necessity of this issue, this article presents an e-content recommendation system that considers the user's learning capability. This work categorizes documents into three levels: basic, intermediate and advanced. Based on the user's learning capability, corresponding documents are recommended, and this idea enhances the overall learning experience. The work consists of three phases: data pre-processing, feature extraction and classification. The collected documents are pre-processed to prepare them for further processing. Features such as Part-Of-Speech (POS) tags, Term Frequency - Inverse Document Frequency (TF-IDF) and WordNet-based semantic similarity are extracted, and a multiclass Support Vector Machine (SVM) is employed for distinguishing between the classes. The performance of the work is tested, and the results prove the efficacy of the work with a 98% accuracy rate, in contrast to the comparative techniques.
- Published
- 2024
- Full Text
- View/download PDF
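The feature-extraction and classification pipeline in entry 42 can be sketched in miniature: a standard-library TF-IDF implementation plus a nearest-document cosine classifier standing in for the paper's multiclass SVM. The toy documents and level labels below are invented for illustration, not taken from the paper.

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF vectors (as dicts) for a list of tokenised documents."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return [{t: (c / len(d)) * math.log(n / df[t])
             for t, c in Counter(d).items()}
            for d in docs]

def cosine(u, v):
    num = sum(w * v.get(t, 0.0) for t, w in u.items())
    den = (math.sqrt(sum(w * w for w in u.values()))
           * math.sqrt(sum(w * w for w in v.values())))
    return num / den if den else 0.0

# Toy corpus: one document per difficulty level (invented for illustration).
docs = ["variables and loops explained simply".split(),
        "classes objects and inheritance in practice".split(),
        "metaclasses descriptors and the import machinery".split()]
labels = ["basic", "intermediate", "advanced"]

def classify(query):
    """Nearest-document cosine classifier, a stand-in for the SVM stage."""
    vecs = tfidf(docs + [query.split()])  # refit so the query shares the idf
    q = vecs[-1]
    best = max(range(len(docs)), key=lambda i: cosine(q, vecs[i]))
    return labels[best]
```

A beginner-flavoured query such as "loops and variables for beginners" lands on the basic-level document because only that document shares its informative (non-stopword) terms.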
43. Ensemble-Based Short Text Similarity: An Easy Approach for Multilingual Datasets Using Transformers and WordNet in Real-World Scenarios.
- Author
-
Gagliardi, Isabella and Artese, Maria Teresa
- Subjects
LANGUAGE models ,CULTURAL property - Abstract
When integrating data from different sources, there are problems of synonymy, different languages, and concepts of different granularity. This paper proposes a simple yet effective approach to evaluate the semantic similarity of short texts, especially keywords. The method is capable of matching keywords from different sources and languages by exploiting transformers and WordNet-based methods. Key features of the approach include its unsupervised pipeline, mitigation of the lack of context in keywords, scalability for large archives, support for multiple languages and real-world scenarios adaptation capabilities. The work aims to provide a versatile tool for different cultural heritage archives without requiring complex customization. The paper aims to explore different approaches to identifying similarities in 1- or n-gram tags, evaluate and compare different pre-trained language models, and define integrated methods to overcome limitations. Tests to validate the approach have been conducted using the QueryLab portal, a search engine for cultural heritage archives, to evaluate the proposed pipeline. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
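The ensemble idea in entry 43, combining transformer-based and WordNet-based scores, reduces to a weighted combination that skips any scorer unable to handle a given pair. A minimal sketch, with a character-trigram scorer and an exact-match scorer standing in for the real embedding and WordNet components (both stand-ins are my assumptions):

```python
def ensemble_similarity(a, b, scorers, weights):
    """Weighted ensemble of similarity scorers; a scorer that cannot
    handle a pair returns None and is skipped, and the remaining
    weights are renormalised."""
    scores, ws = [], []
    for scorer, w in zip(scorers, weights):
        s = scorer(a, b)
        if s is not None:
            scores.append(s)
            ws.append(w)
    return sum(s * w for s, w in zip(scores, ws)) / sum(ws) if ws else 0.0

# Hypothetical stand-ins for the real components (a transformer-embedding
# scorer and a WordNet-based scorer).
def char_ngram_sim(a, b, n=3):
    """Jaccard over character trigrams; abstains on too-short strings."""
    ga = {a[i:i + n] for i in range(len(a) - n + 1)}
    gb = {b[i:i + n] for i in range(len(b) - n + 1)}
    return len(ga & gb) / len(ga | gb) if ga | gb else None

def exact_sim(a, b):
    """1.0 on a case-insensitive exact match, otherwise abstains."""
    return 1.0 if a.lower() == b.lower() else None

sim = ensemble_similarity("folk dance", "folk dances",
                          [char_ngram_sim, exact_sim], [0.7, 0.3])
```

Because `exact_sim` abstains on the near-duplicate pair, its weight is redistributed and the ensemble falls back to the trigram score, which is the kind of graceful degradation the paper's multilingual pipeline needs when one resource lacks coverage.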
44. Multi-knowledge resources-based semantic similarity models with application for movie recommender system.
- Author
-
Huang, Guangjian, Zhu, Xingtu, Wasti, Shahbaz Hassan, and Jiang, Yuncheng
- Subjects
AMBIGUITY ,RECOMMENDER systems ,RESEARCH personnel - Abstract
In recent years, researchers have proposed several feature-based methods to measure semantic similarity using knowledge resources like Wikipedia and WordNet. While Wikipedia covers millions of concepts with multiple features, it has some limitations such as articles with limited content and concept ambiguity. Disambiguating these concepts remains a challenge. Conversely, WordNet offers unambiguous terms by covering all possible senses, making it a useful resource for disambiguating Wikipedia concepts. Additionally, WordNet can enrich the limited content of Wikipedia articles. Thus, we present a new approach that combines both resources to enhance previous feature-based methods of semantic similarity. We begin by analyzing the limitations of previous research, followed by introducing a novel method to disambiguate Wikipedia concepts using WordNet's synonym structure, resulting in more effective disambiguation. Furthermore, we use WordNet to supplement the features in Wikipedia articles and redefine the feature similarity functions. Finally, we train non-linear fitting-based models to measure semantic similarity. Our approach outperforms other previous methods on various benchmarks. To further showcase our approach, we apply our models to develop a movie recommender system using the MovieLens dataset, which consistently outperforms other systems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Specialised language and conceptual knowledge in lexicographic portals.
- Author
-
Giacomini, Laura
- Abstract
Copyright of Lexicographica is the property of De Gruyter and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
46. Lexical Semantics Identification Using Fuzzy Centrality Measures and BERT Embedding
- Author
-
Jain, Minni, Jindal, Rajni, and Jain, Amita
- Published
- 2024
- Full Text
- View/download PDF
47. Reversal of the Word Sense Disambiguation Task Using a Deep Learning Model
- Author
-
Algirdas Laukaitis
- Subjects
word sense disambiguation ,natural language processing ,WordNet ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Word sense disambiguation (WSD) remains a persistent challenge in the natural language processing (NLP) community. While various NLP packages exist, the Lesk algorithm in the NLTK library demonstrates suboptimal accuracy. In this research article, we propose an innovative methodology and an open-source framework that effectively addresses the challenges of WSD by optimizing memory usage without compromising accuracy. Our system seamlessly integrates WSD into NLP tasks, offering functionality similar to that provided by the NLTK library. However, we go beyond the existing approaches by introducing a novel idea related to WSD. Specifically, we leverage deep neural networks and consider the language patterns learned by these models as the new gold standard. This approach suggests modifying existing semantic dictionaries, such as WordNet, to align with these patterns. Empirical validation through a series of experiments confirmed the effectiveness of our proposed method, achieving state-of-the-art performance across multiple WSD datasets. Notably, our system does not require the installation of additional software beyond the well-known Python libraries. The classification model is saved in a readily usable text format, and the entire framework (model and data) is publicly available on GitHub for the NLP research community.
- Published
- 2024
- Full Text
- View/download PDF
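Entry 47 mentions the Lesk algorithm in NLTK; its core idea, picking the sense whose dictionary gloss overlaps the context most, fits in a few lines. Below is a sketch over hand-written toy glosses (real use would call `nltk.wsd.lesk` against WordNet definitions):

```python
def simplified_lesk(word, context, glosses):
    """Pick the sense whose gloss shares the most words with the
    context: the classic Lesk idea behind nltk.wsd.lesk."""
    ctx = set(context)
    best, best_overlap = None, -1
    for sense, gloss in glosses[word].items():
        overlap = len(ctx & set(gloss.split()))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

# Hand-written toy glosses standing in for WordNet definitions.
GLOSSES = {
    "bank": {
        "bank.n.01": "sloping land beside a body of water river",
        "bank.n.02": "financial institution that accepts deposits money",
    }
}
sense = simplified_lesk("bank", "i deposited money at the bank".split(), GLOSSES)
```

Here the word "money" in the context tips the decision toward the financial sense; with no overlapping words at all, plain Lesk simply falls back to the first listed sense, which is one reason its accuracy is often criticised.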
48. Monitoring semantic relatedness and revealing fairness and biases through trend tests
- Author
-
Jean-Rémi Bourguet and Adama Sow
- Subjects
Semantic relatedness ,Fairness ,Biases ,WordNet ,ReVerb ,Visualization ,Information technology ,T58.5-58.64 - Abstract
An emerging application domain for content-based recommender systems is the better consideration of the semantics behind textual descriptions; traditional approaches often miss relevant information because they focus solely on syntax. The Semantic Web community, however, has enriched resources with cultural and linguistic background knowledge, offering new standards for word categorization. This paper proposes a framework that combines the information extractor ReVerb with the WordNet taxonomy to monitor global semantic relatedness scores. Additionally, an experimental validation confronts human-based semantic relatedness scores with theoretical ones, employing Mann–Kendall trend tests to reveal fairness and biases. Overall, our framework introduces a novel approach to semantic relatedness monitoring, providing valuable insights into fairness and biases.
- Published
- 2025
- Full Text
- View/download PDF
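The Mann–Kendall trend test used in entry 48 is straightforward to sketch: count concordant minus discordant pairs (the S statistic) and convert to a two-sided p-value via a normal approximation. The version below omits the tie correction for brevity; the input sequence is invented for illustration.

```python
from itertools import combinations
from statistics import NormalDist

def mann_kendall(xs):
    """Mann-Kendall trend test: S statistic and a two-sided p-value
    via the normal approximation (tie correction omitted for brevity)."""
    n = len(xs)
    # S = number of increasing pairs minus number of decreasing pairs
    s = sum((b > a) - (b < a) for a, b in combinations(xs, 2))
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / var ** 0.5
    elif s < 0:
        z = (s + 1) / var ** 0.5
    else:
        z = 0.0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, p

# A steadily rising sequence of relatedness scores: a clear upward trend.
s, p = mann_kendall([0.10, 0.20, 0.25, 0.30, 0.45, 0.50, 0.60, 0.70])
```

For this monotonically increasing sequence every pair is concordant, so S equals the number of pairs (28 for n = 8) and the p-value is well below 0.05, i.e. the trend is significant.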
49. Semantic-Based Integrated Plagiarism Detection Approach for English Documents.
- Author
-
Kaur, Manpreet, Gupta, Vishal, and Kaur, Ravreet
- Subjects
PLAGIARISM ,NATURAL language processing ,PERFORMANCE standards - Abstract
The proposed work models a novel plagiarism detection system based on semantic features to uncover cases of plagiarism. The system constructs a dynamic relation matrix for each suspicious and source sentence pair to measure their degree of similarity using semantic features. Two procedures, Weighted Inverse Distance and GlossDice, which exploit several text properties (synonyms, shortest path, etc.) to overcome the limitations of the existing features, and a new similarity metric for plagiarism detection are presented in this paper. Moreover, this research investigates the independent performance of various features in detecting plagiarized cases and combines the best features by assigning different weight contributions to further enhance system performance. Weighted Inverse Distance integrated with SynJaccard boosts the system performance and shows promising results. Initially, all the experiments were performed on the PAN-PC-11 dataset, and then the PAN-14 text alignment dataset was used to validate the results of the proposed system. The effectiveness of the proposed system has been measured using standard performance measures, i.e. Precision, Recall, F-measure, Granularity, and Plagdet score. The proposed system outperformed the other baseline systems with precision (0.9459), recall (0.8861), f-measure (0.8917), and plagdet (0.8857) on the PAN-PC-11 dataset. For PAN-14 text alignment, the system exhibits precision (0.9257), recall (0.9055), f-measure (0.8931), and plagdet (0.8806). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
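The SynJaccard feature named in entry 49 can be sketched as Jaccard similarity over synonym-canonicalised token sets. The tiny synonym map below is hypothetical, standing in for WordNet synsets; it only illustrates the idea, not the paper's exact formulation.

```python
# Tiny synonym map standing in for WordNet synsets (hypothetical data).
SYNONYMS = {"buy": "purchase", "car": "automobile", "big": "large"}

def canon(token):
    """Map a token to a canonical representative of its synonym set."""
    return SYNONYMS.get(token, token)

def syn_jaccard(s1, s2):
    """Jaccard over synonym-canonicalised token sets, so that sentences
    that share meaning but few surface words still overlap."""
    a = {canon(t) for t in s1.split()}
    b = {canon(t) for t in s2.split()}
    return len(a & b) / len(a | b)

score = syn_jaccard("he wants to buy a big car",
                    "he wants to purchase a large automobile")
```

Plain Jaccard on the same sentence pair scores 0.4 (only the function words overlap); after canonicalisation the token sets match exactly and `score` is 1.0, which is precisely the paraphrase case a plagiarism detector must catch.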
50. Analysis of web data classification methods based on semantic similarity measure.
- Author
-
Ramesh, Kante and R, Mohanasundram
- Subjects
OPTIMIZATION algorithms ,DATA analysis ,EVIDENCE gaps ,WEB search engines ,CLASSIFICATION - Abstract
In this survey, 60 research papers on web data classification techniques are reviewed; these techniques are used for effective classification of web data and for measuring the semantic relatedness between two words. The web data classification techniques are classified into three types, namely the semantic-based approach, the search engine-based approach, and the WordNet-based approach, and the research issues and challenges confronted by the existing techniques are reported in this survey. Moreover, an analysis of the research works is carried out based on the categorized web data classification techniques, the datasets used, and the evaluation metrics. From the analysis, it is clear that the semantic-based approach is the most widely used technique in the classification of web data. Similarly, the Miller-Charles dataset is the most commonly used dataset in most of the research papers, and evaluation metrics like precision, recall, and F-measure are widely utilized in web data classification. The insights from this manuscript can be used to understand various research gaps and problems in this area, which could be addressed in the future by developing novel optimization algorithms that might enhance the performance of web data classification. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library