Descriptor: "WordNet" / Publication Type: Dissertations - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"WordNet"' showing total 8 results

1. A generic architecture for semantic enhanced tagging systems

Author: Magableh, Murad
Subjects: 600, tagging systems, social web, semantic web, web 2.0, web 3.0, WordNet, multilinguality, cross-language information retrieval, ontology, folksonomy, synonyms, shorthand writing, tag, tags, system tags, user tags
Abstract: The Social Web, or Web 2.0, has recently gained popularity because of its low cost and ease of use. Social tagging sites (e.g. Flickr and YouTube) offer new principles for end-users to publish and classify their content (data). Tagging systems contain free keywords (tags) generated by end-users to annotate and categorise data. Lack of semantics is the main drawback in social tagging due to the use of unstructured vocabulary. Therefore, tagging systems suffer from shortcomings such as low precision, lack of collocation, synonymy, multilinguality, and use of shorthands. Consequently, relevant content is not visible, and thus not retrievable, when searching in tag-based systems. On the other hand, the Semantic Web, the so-called Web 3.0, provides a rich semantic infrastructure. Ontologies are the key enabling technology for the Semantic Web. Ontologies can be integrated with the Social Web to overcome the lack of semantics in tagging systems. In the work presented in this thesis, we build an architecture to address a number of drawbacks of tagging systems. In particular, we make use of the controlled vocabularies presented by ontologies to improve information retrieval in tag-based systems. Based on the tags provided by the end-users, we introduce the idea of adding “system tags” from semantic, as well as social, resources. The “system tags” are comprehensive and wide-ranging in comparison with the limited “user tags”. The system tags are used to fill the gap between the user tags and the search terms used for searching in the tag-based systems. We restricted the scope of our work to the following shortcomings of tagging systems: (1) the lack of semantic relations between user tags and search terms (e.g. synonymy, hypernymy); (2) the lack of translation mediums between user tags and search terms (multilinguality); (3) the lack of context to define the emergent shorthand writing user tags.
To address the first shortcoming, we use the WordNet ontology as a semantic lingual resource from which system tags are extracted. For the second shortcoming, we use the MultiWordNet ontology to recognise the cross-language links between different languages. Finally, to address the third shortcoming, we use tag clusters that are obtained from the Social Web to create a context for defining the meaning of shorthand writing tags. A prototype for our architecture was implemented. In the prototype system, we built our own database to host videos that we imported from a real tag-based system (YouTube). The user tags associated with these videos were also imported and stored in the database. For each user tag, our algorithm adds a number of system tags that come from either semantic ontologies (WordNet or MultiWordNet) or from tag clusters that are imported from the Flickr website. Therefore, each system tag added to annotate the imported videos has a relationship with one of the user tags on that video. The relationship might be one of the following: synonymy, hypernymy, similar term, related term, translation, or clustering relation. To evaluate the suitability of our proposed system tags, we developed an online environment where participants submit search terms and retrieve two groups of videos to be evaluated. Each group is produced from one distinct type of tags: user tags or system tags. The videos in the two groups are produced from the same database and are evaluated by the same participants in order to have a consistent and reliable evaluation. Since user tags are what is used nowadays for searching real tag-based systems, we take their efficiency as the reference against which we compare the efficiency of the new system tags. In order to compare the relevancy between the search terms and each group of retrieved videos, we carried out a statistical analysis.
According to the Wilcoxon signed-rank test, there was no significant difference between using system tags and using user tags. The findings revealed that the use of the system tags in the search is as efficient as the use of the user tags; both types of tags produce different results, but at the same level of relevance to the submitted search terms.
Published: 2011
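
The system-tag idea in this abstract can be illustrated with a short Python sketch. It is not the thesis's implementation: the `TOY_LEXICON` dictionary is an invented stand-in for WordNet, and the function names `expand_tags` and `matches` are hypothetical.

```python
# Toy stand-in for WordNet: each user tag maps to related terms by relation type.
# All entries are illustrative assumptions, not data from the thesis.
TOY_LEXICON = {
    "car": {"synonymy": ["automobile"], "hypernymy": ["vehicle"]},
    "dog": {"synonymy": ["hound"], "hypernymy": ["animal"]},
}


def expand_tags(user_tags):
    """Return system tags as (system_tag, relation, source_user_tag) triples."""
    system_tags = []
    for tag in user_tags:
        relations = TOY_LEXICON.get(tag.lower(), {})
        for relation, terms in relations.items():
            for term in terms:
                system_tags.append((term, relation, tag))
    return system_tags


def matches(search_term, user_tags):
    """True if the search term hits a user tag or a system tag derived from one."""
    term = search_term.lower()
    if term in (t.lower() for t in user_tags):
        return True
    return any(term == system_tag for system_tag, _, _ in expand_tags(user_tags))
```

With this toy lexicon, a video tagged only `car` would also be found by a search for `vehicle`, because each system tag keeps a link to the user tag it came from.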

2. Semantic Feature Extraction Using Multi-Sense Embeddings and Lexical Chains

Author: Ruas, Terry L.
Subjects: Synsets, WordNet, MSSA, Natural language processing, Semantics, Lexical chains
Abstract: The relationship between words in a sentence often tells us more about the underlying semantic content of a document than its actual words individually. In the last few years, natural language understanding has seen an increasing effort to develop techniques that produce non-trivial features, especially after robust word embedding models became prominent and proved able to capture and represent semantic relationships from massive amounts of data. These new dense vector representations indeed raise the baseline in natural language processing, but they still fall short in dealing with intrinsic issues in linguistics, such as polysemy and homonymy. Systems that make use of natural language at their core can be affected by a weak semantic representation of human language, resulting in inaccurate outcomes based on poor decisions. In this area, word sense disambiguation and lexical chains have been exploring alternatives to alleviate several problems in linguistics, such as semantic representation, definitions, differentiation, polysemy, and homonymy. However, little effort is seen in combining recent advances in token embeddings (e.g. words, documents) with word sense disambiguation and lexical chains. To help bridge these areas, this work proposes as its main contributions a collection of algorithms to extract semantic features from large corpora, named MSSA, MSSA-D, MSSA-NR, FLLC II, and FXLC II. The MSSA techniques focus on disambiguating and annotating each word by its specific sense, considering the semantic effects of its context. The lexical chain algorithms derive the semantic relations between consecutive words in a document in a dynamic and pre-defined manner. These techniques aim to uncover the implicit semantic links between words using their lexical structure, incorporating multi-sense embeddings, word sense disambiguation, lexical chains, and lexical databases.
A few natural language problems are selected to validate the contributions of this work, in which our techniques outperform state-of-the-art systems. All the proposed algorithms can be used separately as independent components or combined in one single system to improve the semantic representation of words, sentences, and documents. Additionally, they can also work in a recurrent form, further refining their results.
Published: 2019

3. Hypernym Discovery over WordNet and English Corpora - using Hearst Patterns and Word Embeddings

Author: Vallabhajosyula, Manikya Swathi
Subjects: Hypernym, NLP, Patterns, SemEval, Taxonomy, WordNet
Abstract: Languages evolve over time. With new technical innovations, new terms get created and new senses are added to existing words. Dictionaries like WordNet, which act as a database for English vocabulary, should be updated with these new concepts. WordNet organizes these concepts in sets of synonyms and interlinks them by using semantic relations. Many Natural Language Processing applications like Machine Translation and Word Sense Disambiguation rely on WordNet for their functionality. WordNet was last updated in 2006. If WordNet is not updated with new vocabulary, the performance of applications which rely on WordNet would drop. The objective of our research is to automatically update WordNet with the new senses by using resources like online dictionaries and text corpora available over the internet. We use the ISA hierarchy structure of WordNet to insert new senses. In an ISA hierarchy, the concepts higher in the hierarchy (called hypernyms) are more abstract representations of the concepts lower in the hierarchy (called hyponyms). To improve the coverage of our solution, we rely on two complementary techniques - traditional pattern matching and modern vector space models - to extract candidate hypernyms from WordNet for a new sense. Our system was ranked 4th among the systems that participated in SemEval 2016 Task 14, Semantic Taxonomy Enrichment. We also evaluate our system by participating in SemEval 2018 Task 09, Hypernym Discovery. In this task, we apply our system to the huge UMBC WebBase text corpus to extract candidate hypernyms for a given input term. Our system was ranked 3rd among the systems which find hypernyms for Concepts.
Published: 2018
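
The pattern-matching technique named in the title can be illustrated with one classic Hearst pattern, "X such as Y, Z and W". The Python sketch below is an assumption-laden toy, not the thesis's system: it handles a single pattern and single-word terms only, and the names `SUCH_AS`, `SEPARATOR`, and `hearst_pairs` are hypothetical.

```python
import re

# Separators allowed between listed hyponyms: ", ", " and ", " or ", ", and ", ", or ".
SEPARATOR = r",? (?:and|or) |, "

# One Hearst pattern: "<hypernym> such as <hyponym>(, <hyponym>)* (and|or <hyponym>)".
# Only single-word terms are matched; real systems chunk noun phrases first.
SUCH_AS = re.compile(r"(\w+) such as (\w+(?:(?:" + SEPARATOR + r")\w+)*)")


def hearst_pairs(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        hyponyms = re.split(SEPARATOR, match.group(2))
        pairs.extend((hyponym, hypernym) for hyponym in hyponyms)
    return pairs
```

For example, `hearst_pairs("She plays instruments such as guitar, piano and violin.")` pairs each of the three listed words with `instruments`. A pattern this loose also produces false positives (any word following "and" is treated as a hyponym), which is one reason to combine it with a second signal such as word embeddings.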

4. Language Evolves, so should WordNet - Automatically Extending WordNet with the Senses of Out of Vocabulary Lemmas

Author: Rusert, Jonathan
Subjects: Natural Language Processing, WordNet
Abstract: This thesis provides a solution which finds the optimal location to insert the sense of a word not currently found in the lexical database WordNet. Currently WordNet contains common words that are already well established in the English language. However, there are many technical terms and examples of jargon that suddenly become popular, and new slang expressions and idioms that arise. WordNet will only stay viable to the degree to which it can incorporate such terminology in an automatic and reliable fashion. To solve this problem we have developed an approach which measures the relatedness of the definition of a novel sense with the definitions of all senses with the same part of speech in WordNet. These measurements were done using a variety of measures, including Extended Gloss Overlaps, Gloss Vectors, and Word2Vec. After identifying the most related definition to the novel sense, we determine if this sense should be merged as a synonym or attached as a hyponym to an existing sense. Our method participated in a shared task on Semantic Taxonomy Enrichment conducted as a part of SemEval-2016 and fared much better than a random baseline and was comparable to various other participating systems. This approach is not only effective; it also represents a departure from existing techniques and thereby expands the range of possible solutions to this problem.
Published: 2017
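
The core step described here, scoring a novel definition against every existing definition and keeping the best match, can be sketched in Python as follows. The relatedness score is plain Jaccard word overlap, a deliberate simplification that is not Extended Gloss Overlaps, Gloss Vectors, or Word2Vec; the stopword list, sense identifiers, and function names are all hypothetical.

```python
import re

# Small illustrative stopword list (an assumption, not taken from the thesis).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "to", "that", "is", "for", "in", "with"}


def content_words(definition):
    """Lowercase the definition and keep the set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", definition.lower()) if w not in STOPWORDS}


def overlap(def_a, def_b):
    """Jaccard overlap between the content words of two definitions."""
    a, b = content_words(def_a), content_words(def_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def most_related_sense(novel_definition, existing_senses):
    """Return the (sense_id, score) pair whose definition best overlaps the novel one.

    existing_senses maps a sense identifier to its definition text.
    """
    scored = [(sid, overlap(novel_definition, d)) for sid, d in existing_senses.items()]
    return max(scored, key=lambda pair: pair[1])
```

Deciding whether the novel sense is then merged as a synonym or attached as a hyponym would need a further rule, for instance a threshold on the score, which the abstract does not specify.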

5. Semantic Similarity of Node Profiles in Social Networks

Author: Rawashdeh, Ahmad
Subjects: Computer Science, Social Networks, Wordnet, Semantic, Machine Learning, Link Prediction
Abstract: It can be said, without exaggeration, that social networks have taken a large segment of the population by storm. Regardless of the actual geographical location or of socio-economic status, as long as access to an internet-connected computer is available, a person has access to the whole world, and to a multitude of social networks. By being able to share, comment, and post on various social network sites, a user of social networks becomes a "citizen of the world", ensuring presence across boundaries (be they geographic or socio-economic boundaries). At the same time social networks have brought forward many issues that are interesting from a computing point of view. One of these issues is that of evaluating similarity between nodes/profiles in a social network. Such evaluation is not only interesting, but important, as the similarity underlies the formation of communities (in real life or on the web) and the acquisition of friends (in real life and on the web). In this thesis, several methods for finding similarity, including semantic similarity, are investigated, and a new approach, Wordnet-Cosine similarity, is proposed. The Wordnet-Cosine similarity (and associated distance measure) combines a lexical database, Wordnet, with Cosine similarity (from information retrieval) to find possible similar profiles in a network. In order to assess the performance of the Wordnet-Cosine similarity measure, two experiments have been conducted. The first experiment illustrates the use of Wordnet-Cosine similarity in community formation. Communities are considered to be clusters of profiles. The results of using Wordnet-Cosine are compared with those using four other similarity measures (also described in this thesis). In the second set of experiments, Wordnet-Cosine was applied to the problem of link prediction. Its performance in predicting links in a random social graph was compared with a random link predictor and was found to achieve better accuracy.
Published: 2015
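
The abstract does not say how Wordnet-Cosine combines the lexical database with cosine similarity. The Python sketch below shows one possible reading, offered only as an assumption: profile terms are first mapped to a canonical synonym, and cosine similarity is then computed over the resulting term counts. The `CANONICAL` table is an invented stand-in for Wordnet, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy synonym table standing in for a lexical database; entries are assumptions.
CANONICAL = {"film": "movie", "cinema": "movie", "soccer": "football"}


def profile_vector(terms):
    """Count profile terms after mapping each one to its canonical synonym."""
    return Counter(CANONICAL.get(t.lower(), t.lower()) for t in terms)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def profile_similarity(terms_a, terms_b):
    """Similarity of two profiles given as lists of interest terms."""
    return cosine(profile_vector(terms_a), profile_vector(terms_b))
```

Under this reading, two profiles listing `film, soccer` and `movie, football` score as near-identical, whereas plain cosine similarity on the raw terms would score them as unrelated.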

6. An empirical study of semantic similarity in WordNet and Word2Vec

Author: Handler, Abram
Subjects: Word2Vec, Natural Language Processing, WordNet, Distributional Semantics, Artificial Intelligence and Robotics, Computational Linguistics, Other Computer Engineering
Abstract: This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others -- with Word2Vec returning more hypernyms, synonyms and hyponyms than hyponyms or holonyms. It also shows the probability that neighbors separated by a given cosine distance in Word2Vec are semantically related in WordNet. This result both adds to our understanding of the still-unknown Word2Vec and helps to benchmark new semantic tools built from word vectors.
Published: 2014

7. Inference of lexical ontologies. The LeOnI methodology

Author: Farreres, Javier
Subjects: Lexico-conceptual ontologies, WordNet, Automatic ontology building, Logistic regression
Abstract: In this article we present a method for semi-automatically deriving lexico-conceptual ontologies in other languages, given a lexico-conceptual ontology for one language and bilingual mapping resources. Our method uses a logistic regression model to combine mappings proposed by a set of classifiers (up to 17 in our implementation). The method is formally described and evaluated by means of two implementations for semiautomatically building Spanish and Thai WordNets using Princeton’s WordNet for English and conventional English–Spanish and English–Thai bilingual dictionaries.
Published: 2010
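
Combining classifier proposals with a logistic regression model can be sketched in Python as below. This is a generic logistic combination under stated assumptions, not the LeOnI implementation: the binary vote encoding, the weights, the bias, and the name `mapping_probability` are all hypothetical, and in practice the weights would be fitted on labelled mappings.

```python
import math


def mapping_probability(votes, weights, bias):
    """Logistic model: probability that a proposed word-to-concept mapping is correct.

    votes[i] is 1 if classifier i proposed the mapping and 0 otherwise;
    weights[i] is the (assumed, pre-fitted) coefficient for classifier i.
    """
    if len(votes) != len(weights):
        raise ValueError("one weight per classifier is required")
    z = bias + sum(w * v for w, v in zip(weights, votes))
    return 1.0 / (1.0 + math.exp(-z))
```

A mapping proposed by more, or more heavily weighted, classifiers receives a higher probability, and a cut-off on that probability decides which mappings are accepted.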

8. Information Retrieval Using Lucene and WordNet

Author: Whissel, Jhon F.
Subjects: Computer Science, Lucene, WordNet, Nutch, open source, information retrieval
Abstract: This thesis outlines the use of Apache Lucene, its subproject Nutch, and WordNet as tools for information retrieval, a science of searching through data in order to obtain knowledge that has become increasingly relevant in the Information Age. Lucene is a software library released by the Apache Software Foundation which provides information retrieval capabilities to programmers. Nutch, based on Lucene, adds Internet search functionality. WordNet is a lexical database which groups similar words into sets of synonyms, or synsets, and tracks their semantic relationships. This creates a combination of a dictionary and a thesaurus which may be browsed as such or used in software applications. Lucene and Nutch are released under the Apache Software License and are free and open source. WordNet is released under a similar license. The availability of free and open source tools such as Lucene, Nutch, and WordNet grants software developers a base from which they may create applications that can be tailored to their specifications, while simultaneously eliminating the need to create the entire code base for their projects from scratch. This practice can result in reduced programming time and lower software development costs. The remainder of this document outlines the use of these tools and then presents the methodology for the integration of their combined capabilities into a search engine capable of online information retrieval. This discussion culminates with a demonstration of how WordNet may be employed to remove search query ambiguity.
Published: 2009
