512 results
Search Results
2. Revisiting subject classification in academic databases: A comparison of the classification accuracy of Web of Science, Scopus & Dimensions.
- Author
- Singh, Prashasti, Piryani, Rajesh, Singh, Vivek Kumar, Pinto, David, Singh, Vivek, and Perez, Fernando
- Subjects
- INFORMATION retrieval, WEB databases, CLASSIFICATION, SCIENCE databases, DATABASES, ELECTRONIC journals
- Abstract
Classification of research articles into different subject areas is an extremely important task in bibliometric analysis and information retrieval. There are primarily two kinds of subject classification approaches used in different academic databases: journal-based (aka source-level) and article-based (aka publication-level). The two popular academic databases- Web of Science and Scopus- use journal-based subject classification scheme for articles, which assigns articles into a subject based on the subject category assigned to the journal in which they are published. On the other hand, the recently introduced Dimensions database is the first large academic database that uses article-based subject classification scheme that assigns the article to a subject category based on its contents. Though the subject classification schemes of Web of Science have been compared in several studies, no research studies have been done on comparison of the article-based and journal-based subject classification systems in different academic databases. This paper aims to compare the accuracy of subject classification system of the three popular academic databases: Web of Science, Scopus and Dimensions through a large-scale user-based study. Results show that the commonly held belief of superiority of article-based subject classification over the journal-based subject classification scheme does not hold at least at the moment, as Web of Science appears to have the most accurate subject classification. [ABSTRACT FROM AUTHOR]
- Published
- 2020
3. Editorial.
- Author
- Patnaik, Srikanta
- Subjects
- MACHINE learning, SOFT computing, STOCK exchanges, REGRESSION analysis, INFORMATION retrieval
- Published
- 2018
4. Special issue: Selected papers of KES2012 - Part 2 of 2.
- Author
- Graña, M., Gonzalez-Acuña, A.I., and Zanni-Merk, C.
- Subjects
- *SPECIAL issues of periodicals, *CONFERENCES & conventions, *THEORY of knowledge, *KNOWLEDGE management, *INFORMATION retrieval, *SEMANTICS
- Abstract
The papers in this issue are a selection of the papers presented at the 16th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES2012) held on 10, 11 and 12 September 2012, in San Sebastian, Spain. The main bias for the selection of the papers has been the proposition of foundational works or reviews that focus on some specific issues of intelligent systems and knowledge engineering. The variety of the papers collected is great going from some abstract mathematical topics up to more close to the earth applications of knowledge engineering such as information retrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2013
5. MM-FOOD: a high-dimensional index structure for efficiently querying content and concept of multimedia data.
- Author
- Arslan, Serdar and Yazici, Adnan
- Subjects
- MULTIMEDIA systems, INFORMATION retrieval, FUZZY algorithms, INDEXING, MULTIDIMENSIONAL scaling
- Abstract
The semantic query problem is commonly called the semantic gap and is one of the significant problems in multimedia data retrieval. In this study, we focus on multimedia data retrieval by combining semantic information with data content to solve the semantic gap problem effectively. The main idea behind the combination of low-level content descriptors and the concept of multimedia data is to represent the content information with the semantic information by adding a low-level content descriptor as a new dimension to the index structure. This new dimension is represented by constructing an array index structure that uses a fuzzy clustering algorithm. Thus, a new high-dimensional index structure, named MM-FOOD, supporting querying of multimedia data, including fuzzy querying, is presented in this paper. This proposed index structure's construction and query algorithms are explained throughout this paper. Our experiments show that our indexing mechanism is considerably efficient compared to the basic indexing approach, which stores low-level content and semantic concept descriptors in separate structures when the data size is large. [ABSTRACT FROM AUTHOR]
- Published
- 2023
6. An evaluation metric for image retrieval systems, using entropy for grouped precision of relevant retrievals.
- Author
- Gherbi, Tahar, Zeggari, Ahmed, Ahmed Seghir, Zianou, and Hachouf, Fella
- Subjects
- IMAGE retrieval, CONTENT-based image retrieval, IMAGING systems, IMAGE databases, ENTROPY, RESEARCH personnel
- Abstract
Evaluating the performance of Content-Based Image Retrieval (CBIR) systems is a challenging and intricate task, even for experts in the field. The literature presents a vast array of CBIR systems, each applied to various image databases. Traditionally, automatic metrics employed for CBIR evaluation have been borrowed from the Text Retrieval (TR) domain, primarily precision and recall metrics. However, this paper introduces a novel quantitative metric specifically designed to address the unique characteristics of CBIR. The proposed metric revolves around the concept of grouping relevant images and utilizes the entropy of the retrieved relevant images. Grouping together relevant images holds great value from a user perspective, as it enables more coherent and meaningful results. Consequently, the metric effectively captures and incorporates the grouping of the most relevant outcomes, making it highly advantageous for CBIR evaluation. Additionally, the proposed CBIR metric excels in differentiating between results that might appear similar when assessed using other metrics. It exhibits a superior ability to discern subtle distinctions among retrieval outcomes. This enhanced discriminatory power is a significant advantage of the proposed metric. Furthermore, the proposed performance metric is designed to be straightforward to comprehend and implement. Its simplicity and ease of use contribute to its practicality for researchers and practitioners in the field of CBIR. To validate the effectiveness of our metric, we conducted a comprehensive comparative study involving prominent and well-established CBIR evaluation metrics. The results of this study demonstrate that our proposed metric exhibits robust discrimination power, outperforming existing metrics in accurately evaluating CBIR system performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
7. Telling the Story of Frontotemporal Dementia by Bibliometric Analysis.
- Author
- Guido, Davide, Morandi, Gabriella, Palluzzi, Fernando, and Borroni, Barbara
- Subjects
- FRONTOTEMPORAL dementia, BIBLIOMETRICS, HISTORY of medicine, INFORMATION retrieval, MEDICAL databases, EPIDEMIOLOGY, BRAIN imaging, ANIMALS
- Abstract
In this paper, we reconstructed the medical history of frontotemporal dementia (FTD) by reviewing the literature and analyzing papers with the highest impact through citation index. Several research studies and groups involved in FTD have been reviewed. An increasing amount of knowledge has been made available in the last 20 years through a large number of publications, leading to a better definition of the genetic and clinical bases of the disease. A total of 1,436 references (articles and reviews), published in 395 journals, were retrieved through the Scopus database. The two highest publication peaks (i.e., largest number of publications) were found in 2000 and 2008. The most cited papers considering both total citation number and the number of citations within the first two years after publication refer to: (i) the genetic bases of FTD, (ii) the clinical criteria that progressively refined the different FTD phenotypes, and (iii) FTD epidemiology. Advanced neuroimaging techniques, genotype-phenotype heterogeneity, and animal models gave us a broader understanding of various aspects of the disorder. These findings confirm the great interest in FTD research. The analysis of the literature might help in guiding future goals in the field. [ABSTRACT FROM AUTHOR]
- Published
- 2015
8. Learning hierarchical embedding space for image-text matching.
- Author
- Sun, Hao, Qin, Xiaolin, and Liu, Xiaojing
- Subjects
- *SUBSPACES (Mathematics), *LEXICAL access, *IMAGE representation
- Abstract
There are two mainstream strategies for image-text matching at present. The one, termed as joint embedding learning, aims to model the semantic information of both image and sentence in a shared feature subspace, which facilitates the measurement of semantic similarity but only focuses on global alignment relationship. To explore the local semantic relationship more fully, the other one, termed as metric learning, aims to learn a complex similarity function to directly output score of each image-text pair. However, it significantly suffers from more computation burden at retrieval stage. In this paper, we propose a hierarchically joint embedding model to incorporate the local semantic relationship into a joint embedding learning framework. The proposed method learns the shared local and global embedding spaces simultaneously, and models the joint local embedding space with respect to specific local similarity labels which are easy to access from the lexical information of corpus. Unlike the methods based on metric learning, we can prepare the fixed representations of both images and sentences by concatenating the normalized local and global representations, which makes it feasible to perform the efficient retrieval. And experiments show that the proposed model can achieve competitive performance when compared to the existing joint embedding learning models on two publicly available datasets Flickr30k and MS-COCO. [ABSTRACT FROM AUTHOR]
- Published
- 2024
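The abstract of item 8 mentions building fixed representations by concatenating normalized local and global embeddings. The sketch below is only an illustration of that one idea, not the authors' model: the vectors, their dimensions, and the helper names (`l2_normalize`, `joint_representation`) are made-up assumptions. It shows why the trick makes retrieval cheap: the dot product of two concatenated vectors equals the local cosine similarity plus the global cosine similarity, so representations can be computed once and compared with a single inner product.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def joint_representation(local_vec, global_vec):
    """Concatenate the separately normalized local and global embeddings."""
    return l2_normalize(local_vec) + l2_normalize(global_vec)


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical embeddings for one image and one sentence.
image_repr = joint_representation([0.2, 0.9, 0.1], [1.0, 0.0])
text_repr = joint_representation([0.1, 0.8, 0.3], [0.6, 0.8])

# One inner product gives local cosine similarity + global cosine similarity.
score = dot(image_repr, text_repr)
print(round(score, 4))
```

Because each side is encoded independently, image and sentence vectors could be indexed ahead of time and scored at query time without running a pairwise similarity network.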
9. Scientific document retrieval using structure encoded string with trie indexing.
- Author
- Dhar, Sourish, Roy, Sudipta, and Paul, Arnab
- Subjects
- INFORMATION retrieval, MATHEMATICAL formulas, INDEXING, ENCODING
- Abstract
Retrieving mathematical expressions from scientific documents is a challenging task as mathematical expressions or formulae are quite different from the traditional text. Mathematical expressions are highly symbolic and complex. Moreover, the structure of a mathematical formula conveys a semantic meaning which cannot be overlooked. This paper proposes a scientific document retrieval system based on mathematical formula query. The paper explores the concept of Structure Encoded String (SES), which has been employed for mathematical expressions to capture the relations among the formula structures. A pattern based trie indexing scheme has been proposed for faster retrieval. The Jaro-Winkler Similarity has been adopted for matching and ranking. Experiments are conducted, results are reported using standard evaluation measures and compared with similar existing systems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
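The abstract of item 9 says Jaro-Winkler similarity is used for matching and ranking. The sketch below implements the standard Jaro-Winkler measure and ranks candidate strings against a query. It is a generic illustration, not the paper's system: the Structure Encoded String format and the trie index are not reproduced, and the example strings, the `rank` helper, and the defaults (`p=0.1`, prefix capped at 4) are assumptions based on the commonly cited definition.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if equal and within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1, max_prefix=4):
    """Jaro similarity boosted for strings that share a common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def rank(query, candidates):
    """Order candidates from most to least similar to the query."""
    return sorted(candidates, key=lambda c: jaro_winkler(query, c), reverse=True)


# Hypothetical strings standing in for encoded formulae.
print(rank("MARTHA", ["XYZ", "MARHTA", "MARTHA"]))
```

The prefix boost rewards candidates that agree with the query from the first character onward, which can help when encoded strings share a leading structure.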
10. Intelligent retrieval method of library document information based on hidden topic mining.
- Author
- An, Yujie and Yan, Yuwei
- Subjects
- INFORMATION retrieval, MINES & mineral resources, LIBRARIES
- Abstract
In order to overcome the problems of retrieval accuracy and time-consuming of traditional document information retrieval methods, this paper designs an intelligent retrieval method of library document information based on hidden topic mining. Firstly, LDA model is used to mine the hidden topics of library document information, and then, based on the mining results, similarity degree of document information is calculated in inference network model. Finally, the Bayesian model is constructed in the sample space to retrieve the library literature information under the maximum retrieval space coverage. Experimental results show that, compared with traditional retrieval methods, the proposed method improves the retrieval accuracy significantly, with the highest retrieval accuracy reaching 99%, and the retrieval time is significantly reduced, indicating that the proposed method effectively improves the retrieval accuracy and timeliness. [ABSTRACT FROM AUTHOR]
- Published
- 2022
11. Research on data retrieval and analysis system based on Baidu reptile technology in big data era.
- Author
- Jin, Jiangang, Elhoseny, Mohamed, and Yuan, X.
- Subjects
- INFORMATION retrieval, SYSTEM analysis, BIG data, WEBSITES, DATA analysis, SCALABILITY, INFORMATION networks, TEMPORAL databases
- Abstract
With the rapid development of the Internet, the current Web has become the main platform for people to publish and retrieve information. How to quickly and accurately find the information required by users in a large amount of network information resources has become an urgent need of the people. Web crawlers are research fields that appear to meet this demand. Based on this, the paper designs and implements a distributed web crawler system based on the existing research work, and its goal is to provide high quality data support for the network public opinion system. The web crawler system designed and implemented in this paper solves the problems of low efficiency, poor scalability and low automation of single-machine crawlers, which improves the speed of webpage collection and data extraction precision and expands the scale of webpage collection. At the end of the article, the system related interface screenshots and test results are displayed. It can be seen from the test results that the crawler system can effectively collect dynamic web pages, and the result of automatic extraction of web pages has high precision, and also realizes the entire crawling system. [ABSTRACT FROM AUTHOR]
- Published
- 2020
12. A method for updating ontology-based user profile in Personalized Document Retrieval System using Bayesian networks.
- Author
- Maleszka, Bernadetta, Nguyen, Ngoc Thanh, Szczerbicki, Edward, Trawiński, Bogdan, and Nguyen, Van Du
- Subjects
- INFORMATION retrieval, FILTERING software, ONTOLOGIES (Information retrieval), SOCIAL networks, INFORMATION networks, SOCIAL accounting, DATABASES
- Abstract
Traditional approaches to content-based recommendation and collaborative filtering suffer from the cold-start problem, that is, the challenge of recommending items to an unknown user. In this paper we present a Personalized Document Retrieval System which takes into account social network information about the users. The overall idea of the system is to cluster users into groups of similar interests based on their usage data and to determine a representative profile for each of the groups. When a new user joins the system, he or she is classified into one of the existing groups based on his or her user data, and the representative profile of the group becomes a starting profile for the new user. This paper focuses on a method for updating an ontology-based user profile using a Bayesian network approach. We analyze some properties of the proposed updating method and describe an idea for experimental evaluations. [ABSTRACT FROM AUTHOR]
- Published
- 2019
13. Chinese-Vietnamese cross-lingual event retrieval method based on knowledge distillation.
- Author
- Gao, Shengxiang, He, Zhilei, Yu, Zhengtao, Zhu, Enchang, and Wu, Shaoyang
- Subjects
- *INFORMATION retrieval, *KNOWLEDGE transfer, *PROBLEM solving
- Abstract
Cross-lingual event retrieval is an information retrieval task that aims to find text or documents related to a specific event across multiple languages. Chinese-Vietnamese cross-lingual event retrieval specifically involves using a Chinese query to retrieve Vietnamese documents related to the query event. The critical issue is how to efficiently align query and document representations with limited resources. Existing cross-language pre-training models are trained on large-scale multilingual corpora, but their training goals do not include explicit language alignment tasks. Because the training corpora are unevenly distributed across languages, these models suffer from language bias, and cross-lingual retrieval built on them inherits this bias. To solve this problem, this paper proposes a Chinese-Vietnamese cross-lingual event retrieval method based on knowledge distillation. This approach enables the model to learn good query-document matching features from monolingual retrieval by transferring knowledge from high-resource to low-resource languages. By enhancing the alignment between queries and documents in different languages in a shared semantic space, the method improves the performance of Chinese-Vietnamese cross-lingual event retrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. Pediatric Pain Management Knowledge Linkages: Mapping Experiential Knowledge to Explicit Knowledge.
- Author
- Safran, C., Reti, S., Marin, H.F., Stewart, Sam, Abidi, Syed Sibte Raza, and Finley, Allen
- Abstract
The goal of this project is to augment clinician communication by connecting it to evidence-based research, providing explicit knowledge to corroborate the experiential knowledge shared between health care practitioners. The source of tacit knowledge sharing is the Pediatric Pain Mailing List (PPML), a forum for practicing clinicians to contact peers on the subject of pain in children. The messages, dating back to 1993, are processed for pertinent information and gathered together into threads. They are then parsed and connected to a set of MeSH keywords, which is used to search Pubmed and return a set of papers that correspond to the subject being discussed. The results are presented in an online forum, providing clinicians with an arena in which they can browse the archives of the PPML and connect those conversations to pertinent medical literature. [ABSTRACT FROM AUTHOR]
- Published
- 2010
15. Institutional repositories and knowledge organization: A bibliographic study from Library and Information Science.
- Author
- Fujita, Mariângela Spotti Lopes, Agustín-Lacruz, Carmen, Tolare, Jéssica Beatriz, Terra, Ana Lúcia, and Bueno-de-la-Fuente, Gema
- Subjects
- KNOWLEDGE management, INFORMATION architecture, DATA mining, INFORMATION retrieval, DATA management
- Abstract
This research presents an exploratory and descriptive study on the use of knowledge organization processes and systems in the context of repositories, published in journals indexed in databases between 2015 and 2020. The authors of these papers do not necessarily publish in specific events and journals in the Knowledge Organization area, but rather in the Library and Information Science arena. The study has been carried out in four steps: 1. Search, retrieval and selection of articles; 2. Development of a data codebook; 3. Identification and codification of topics; and, 4. Analysis of the data extracted. A final sample of 33 articles was defined. The methodology applied to determine the theme of journal articles is presented in detail, including the use of the Classification System for Knowledge Organization Literature (CSKOL). The illustrative data of disciplinarity and interdisciplinarity present in the articles object of this study are shown and discussed, regarding the predominance of certain ranges of CSKOL and regarding the diversity of the representative themes of the contents. It is possible to conclude that the use of CSKOL proves to be a suitable lens for analysing and understanding the literature on the field of knowledge organization in institutional repositories. It is shown that, in these 33 articles the themes of knowledge organization are combined with interdisciplinary themes from other areas of knowledge. In our opinion, this enriches and improves theoretical support for research development. [ABSTRACT FROM AUTHOR]
- Published
- 2023
16. An augmented semantic search tool for multilingual news analytics.
- Author
- Harikumar, Sandhya, Sathyajit, Rohit, and Karumudi, Gnana Venkata Naga Sai Kalyan
- Subjects
- INFORMATION retrieval, NATURAL language processing, SEARCH engines, KEYWORD searching, ENGLISH language
- Abstract
News feeds generate colossal amount of data consisting of important information hidden in the intricacies. State of the art methods are still at infancy in providing a very generic and publicly available solution to skim through the important information in the news from various sources and an ability to search using specific keywords in different languages. This paper focuses on designing a tool to extract semantic details from news articles published through various internet sources in various languages. The semantic information is stored within DBMS for ease of organizing and retrieving the data. Further, a querying facility to search through entire articles based on the keyword or date-based search is also proposed to view the crisp content. The news articles in English, and two Indian languages - Hindi and Malayalam are considered for experimentation. The proposed strategy consists of two main components namely, Generative model creation and Query engine. Generative model aims to extract important entities and keywords along with their relevance to the article and other similar articles using Latent Dirichlet Allocation(LDA) and Named Entity Recognition(NER). Query engine is to facilitate on the fly retrieval of semantic content from the database, based on user keyword. The search engine, along with database indexing, reduces the access time to the database thereby retrieving the information in less time. Experimental results show that the proposed method is effective in terms of quality of information and time consumed for information retrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2022
17. Contextual information retrieval in research articles: Semantic publishing tools for the research community.
- Author
- Angrosh, M.A., Cranefield, Stephen, and Stanger, Nigel
- Subjects
- EDUCATION research, INFORMATION retrieval, INFORMATION resources management, ONLINE databases, INFORMATION resources
- Abstract
In recent years, the dramatic increase in academic research publications has gained significant research attention. Research has been carried out exploring novel ways of providing information services using this research content. However, the task of extracting meaningful information from research documents remains a challenge. This paper presents our research work on developing intelligent information systems that exploit online article databases. We present in this paper, a linked data application which uses a new semantic publishing model for providing value added information services for the research community. The paper presents a conceptual framework for modelling contexts associated with sentences in research articles and discusses the Sentence Context Ontology, which is used to convert the information extracted from research documents into machine-understandable data. The paper reports supervised learning experiments carried out using conditional probabilistic models for achieving automatic context identification. The paper also describes a Semantic Web Application that provides various citation context based information services. [ABSTRACT FROM AUTHOR]
- Published
- 2014
18. Meta-Heuristic Feature Optimization for ontology-based data security in a campus workplace with robotic assistance.
- Author
- Gong, Suning, Dinesh Jackson Samuel, R., Pandian, Sanjeevi, Kumar, Priyan Malarvizhi, Pandey, Hari Mohan, and Srivastava, Gautam
- Subjects
- WORK environment, SEMANTICS, RESEARCH evaluation, ARTIFICIAL intelligence, MACHINE learning, ROBOTICS, SOFTWARE architecture, DATA security, INTELLECT, INFORMATION retrieval, ONTOLOGIES (Information retrieval), DATA mining, ALGORITHMS
- Abstract
BACKGROUND: For campus workplace secure text mining, robotic assistance with feature optimization is essential. The space model of the vector is usually used to represent texts. Besides, there are still two drawbacks to this basic approach: the curse and lack of semantic knowledge. OBJECTIVES: This paper proposes a new Meta-Heuristic Feature Optimization (MHFO) method for data security in the campus workplace with robotic assistance. Firstly, the terms of the space vector model have been mapped to the concepts of data protection ontology, which statistically calculate conceptual frequency weights by term various weights. Furthermore, according to the designs of data protection ontology, the weight of theoretical identification is allocated. The dimensionality of functional areas is reduced significantly by combining standard frequency weights and weights based on data protection ontology. In addition, semantic knowledge is integrated into this process. RESULTS: The results show that the development of the characteristics of this process significantly improves campus workplace secure text mining. CONCLUSION: The experimental results show that the development of the features of the concept hierarchy structure process significantly enhances data security of campus workplace text mining with robotic assistance. [ABSTRACT FROM AUTHOR]
- Published
- 2021
19. A distributional semantics-based information retrieval framework for online social networks.
- Author
- Anoop, V.S., Deepak, P., and Asharaf, S.
- Subjects
- ONLINE social networks, INFORMATION retrieval, SOCIAL networks, MOBILE operating systems, ACCURACY of information
- Abstract
Online social networks are considered to be one of the most disruptive platforms where people communicate with each other on any topic ranging from funny cat videos to cancer support. The widespread diffusion of mobile platforms such as smart-phones causes the number of messages shared in such platforms to grow heavily, thus more intelligent and scalable algorithms are needed for efficient extraction of useful information. This paper proposes a method for retrieving relevant information from social network messages using a distributional semantics-based framework powered by topic modeling. The proposed framework combines the Latent Dirichlet Allocation and distributional representation of phrases (Phrase2Vec) for effective information retrieval from online social networks. Extensive and systematic experiments on messages collected from Twitter (tweets) show this approach outperforms some state-of-the-art approaches in terms of precision and accuracy and better information retrieval is possible using the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2021
20. Building knowledge graphs from technical documents using named entity recognition and edge weight updating neural network with triplet loss for entity normalization.
- Author
- Jeon, Sung Hwan, Lee, Hye Jin, Park, Jihye, and Cho, Sungzoon
- Subjects
- *KNOWLEDGE graphs, *PATENT offices, *TEXT mining, *MACHINE learning, *INFORMATION retrieval
- Abstract
Attempts to express information from various documents in graph form are rapidly increasing. The speed and volume in which these documents are being generated call for an automated process, based on machine learning techniques, for cost-effective and timely analysis. Past studies responded to such needs by building knowledge graphs or technology trees from the bibliographic information of documents, or by relying on text mining techniques in order to extract keywords and/or phrases. While these approaches provide an intuitive glance into the technological hotspots or the key features of the select field, there still is room for improvement, especially in terms of recognizing the same entities appearing in different forms so as to interconnect closely related technological concepts properly. In this paper, we propose to build a patent knowledge network using the United States Patent and Trademark Office (USPTO) patent filings for the semiconductor device sector by fine-tuning Huggingface's named entity recognition (NER) model with our novel edge weight updating neural network. For the named entity normalization, we employ edge weight updating neural network with positive and negative candidates that are chosen by substring matching techniques. Experiment results show that our proposed approach performs very competitively against the conventional keyword extraction models frequently employed in patent analysis, especially for the named entity normalization (NEN) and document retrieval tasks. By grouping entities with named entity normalization model, the resulting knowledge graph achieves higher scores in retrieval tasks. We also show that our model is robust to the out-of-vocabulary problem by employing the fine-tuned BERT NER model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Improving entity linking by combining semantic entity embeddings and cross-attention encoder.
- Author
- Li, Shi and Zhang, Yongkang
- Subjects
- *KNOWLEDGE graphs, *CONTEXTUAL learning, *INFORMATION retrieval
- Abstract
Entity linking is an important task for information retrieval and knowledge graph construction. Most existing methods use a bi-encoder structure to encode mentions and entities in the same space, and learn contextual features for entity linking. However, this type of system still faces some problems: (1) the entity embedding part of the model only learns from the local context of the target entity, which is too unique for entity linking model to learn the context commonality of information; (2) the entity disambiguation part only uses similarity calculation once to determine the target entity, resulting in insufficient interaction between the mentions and candidate entities, and ineffective recall of real entities. We propose a new entity linking model based on graph neural network. Different from other bi-encoder retrieval systems, this paper introduces a fine-grained semantic enhancement information into the entity embedding part of the bi-encoder to reduce the specificity of the model. Then, the cross-attention encoder is used to re-rank the target mention and each candidate entity after the entity retrieval model. Experimental results show that although the model is not optimal in inference speed, it outperforms all baseline methods on the AIDA-CoNLL dataset, and has good generalization effects on four datasets in different fields such as MSNBC and ACE2004. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Exploring ChatGPT for next-generation information retrieval: Opportunities and challenges.
- Author
- Huang, Yizheng and Huang, Jimmy X.
- Subjects
- *CHATGPT, *GENERATIVE artificial intelligence, *SUPERVISED learning, *INFORMATION retrieval, *LANGUAGE models, *ARTIFICIAL intelligence
- Abstract
The rapid advancement of artificial intelligence (AI) has spotlighted ChatGPT as a key technology in the realm of information retrieval (IR). Unlike its predecessors, it offers notable advantages that have captured the interest of both industry and academia. While some consider ChatGPT to be a revolutionary innovation, others believe its success stems from smart product and market strategy integration. The advent of ChatGPT and GPT-4 has ushered in a new era of Generative AI, producing content that diverges from training examples, and surpassing the capabilities of OpenAI's previous GPT-3 model. In contrast to the established supervised learning approach in IR tasks, ChatGPT challenges traditional paradigms, introducing fresh challenges and opportunities in text quality assurance, model bias, and efficiency. This paper aims to explore the influence of ChatGPT on IR tasks, providing insights into its potential future trajectory. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Span prompt dense passage retrieval for Chinese open domain question answering.
- Author
-
Fan, Chunxiao, Yan, Zhen, Wu, Yuexin, and Qian, Bing
- Subjects
LANGUAGE models ,OPEN-ended questions ,NATURAL language processing ,INFORMATION retrieval - Abstract
Dense passage retrieval has recently become a popular method in information retrieval, especially in open-domain question answering. It aims to retrieve related passages from a massive collection to answer a question. A dense retriever can increase retrieval speed with less loss of accuracy than other methods. However, the pretrained language models used in recent research are often ineffective at semantic embedding, which reduces accuracy. In addition, we find that contrastive learning causes the representation space to diverge, and that Siamese models with independent parameters on both sides decrease generalization performance. Therefore, we propose span prompt dense passage retrieval (SPDPR), based on span mask prompt tuning and parameter sharing, for Chinese open-domain dense retrieval. This model can generate more efficient representation embeddings and effectively counteract the separation tendency between positive samples. We evaluate the effectiveness of SPDPR on DYKzh, as well as on two other Chinese datasets. SPDPR surpasses all state-of-the-art methods implemented on DYKzh and achieves competitive results on the other datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
24. Video localized caption generation framework for industrial videos.
- Author
-
Khurana, Khushboo and Deshpande, Umesh
- Subjects
MANUFACTURING processes ,DATA augmentation ,VIDEOS ,INFORMATION retrieval ,INFORMATION society - Abstract
In this information age, there is exponential growth in visual content, and video captioning can address many real-life applications. Automatic generation of video captions can help viewers comprehend a video in a short time and can assist in faster information retrieval, video analysis, indexing, report generation, etc. Captioning of industrial videos is important for obtaining a visual and textual summary of the work ongoing in the industry. The generated captioned summary of the video can assist in remote monitoring of industries, and these captions can be utilized for video question-answering, video segment extraction, productivity analysis, etc. Due to the presence of diverse events, processing of industrial videos is more challenging than in other domains. In this paper, we address the real-life application of generating descriptions for the videos of a labor-intensive industry. We propose a keyframe-based approach for the generation of video captions. The framework produces a video summary by extracting keyframes, thereby reducing the video captioning task to image captioning. These keyframes are passed to the image captioning model for description generation. Using these individual frame captions, multi-caption descriptions of a video are generated with a unique start and end time for each caption. For image captioning, a merge encoder-decoder model with a stacked decoder for caption generation is used. We have performed experimentation on a dataset specifically created for the small-scale industry. We have also shown that data augmentation on the small dataset can greatly benefit the generation of remarkably good video descriptions. Results of extensive experimentation performed with different image encoders, language encoders, and decoders in the merge encoder-decoder model are reported. Apart from the results on domain-specific data, results on domain-independent datasets are also presented to show the general applicability of the technique. Performance comparisons on existing datasets (OVSD, Flickr8k and Flickr30k) are reported to demonstrate the scalability of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
25. Exploiting Augmented Reality and Computer Vision for Healthcare Education: The Case of Pharmaceutical Substances Visualization and Information Retrieval.
- Author
-
KOULOURIS, Dionysios, GALLOS, Parisis, MENYCHTAS, Andreas, and MAGLOGIANNIS, Ilias
- Abstract
Augmented Reality (AR) is already used as the primary visualization and user interaction tool in several scientific and business areas. At the same time new AR technologies and frameworks considerably facilitate both the development of innovative applications and also their wide adoption in different domains of everyday life. In the area of healthcare AR solutions make use of mobile or wearable devices and glasses to support, among others, education and healthcare professionals training. The aim of this paper is to present a prototype mHealth app for education, which uses AR and computer vision technologies for pharmaceutical substances recognition on drug packaging. The conceptual design of the system includes three main components which are responsible for a) Text recognition, b) Drug identification and c) AR operations for interactivity. The prototype application is available in Android or iOS platforms and has been evaluated in real-world scenarios. Camera and screen of the mobile phones fulfill the text recognition and AR operations, which eliminates the need for special equipment, while PubChem and 3D Model databases provide assets required for the drug identification and AR visualizations. The results highlight the value of AR for educational purposes, especially when combined with advanced image recognition technologies to build interactive AR encyclopedias. [ABSTRACT FROM AUTHOR]
- Published
- 2022
26. The abstract: A 'three-star' opportunity.
- Author
-
McVeigh, J. G. and Basford, J. R.
- Subjects
ABSTRACTING standards ,ABSTRACTING & indexing services ,INFORMATION retrieval ,QUALITY assurance ,SERIAL publications ,ELECTRONIC publications - Published
- 2016
27. Medical Knowledge Evolution Query Constraining Aspects.
- Author
-
Moen, Anne, Andersen, Stig Kjær, Aarts, Jos, Hurlen, Petter, and Eklund, Ann-Marie
- Abstract
In this paper we present a first analysis towards a better understanding of the query-constraining aspects of knowledge, as expressed in the most used public medical bibliographic database, MEDLINE. Our results indicate, perhaps unsurprisingly, that new terms occur, but also that traditional terms are replaced by more specific ones or even go out of use as they become common knowledge. Hence, as knowledge evolves over time, search methods may benefit from becoming more sensitive to knowledge expression, to enable finding new, as well as older, relevant database contents. [ABSTRACT FROM AUTHOR]
- Published
- 2011
28. An Image Retrieval Method Based on Color-Complexity and Spatial-Histogram Features.
- Author
-
Hsien-Chu Wu and Chin-Chen Chang
- Subjects
INFORMATION retrieval ,IMAGE retrieval ,DIGITAL images ,IMAGE processing ,DATABASES - Abstract
This paper proposes two kinds of image features. One feature is the spatial-histogram feature. It combines the color histogram feature and the information about the dimensional position of pixels in an image to record the distribution of the pixels' colors that are present in different spatial positions within an image. The other image feature proposed in this paper is the color-complexity feature, which can be used to describe the change of pixel colors in the image. From the experimental results, ANMRR value is provided and we observe that the image retrieval system based on these two kinds of image features can provide a fairly good accuracy rate in image retrieval. Moreover, it has the capacity to tolerate errors; that is, for images that are damaged by rotation, shift, or color variant attacks, their similar image pairs can still be retrieved from the image database. Thus, the accuracy and flexibility of the image retrieval system are drastically improved. [ABSTRACT FROM AUTHOR]
- Published
- 2007
29. Aspect term extraction and optimized deep fuzzy clustering-based inverted indexing for document retrieval.
- Author
-
Chandwani, Gunjan, Ahlawat, Anil, and Dubey, Gaurav
- Subjects
INFORMATION retrieval ,DOCUMENT clustering ,INDEXING ,MATHEMATICAL optimization ,MOVING average process - Abstract
Finding relevant documents for query optimization is a well-known difficulty in the field of document retrieval. This paper develops a novel approach, named Exponential Aquila Optimizer (EAO)-based Deep Fuzzy Clustering, for retrieving documents. The proposed technique effectively finds the relevant documents and tries to understand the relationship among the documents and queries in terms of the significance of documents for query optimization. Here, Deep Fuzzy Clustering is employed to perform cluster-based inverted indexing, where the training procedure of Deep Fuzzy Clustering is carried out using the developed optimization algorithm, named EAO. The developed EAO is newly designed by incorporating EWMA and AO. In addition, complex query matching is done using the Tversky index for user-based queries, such as multigram queries and semantic queries. The RV coefficient, in turn, is used to perform query optimization for relevant document retrieval. The proposed technique achieves better performance in terms of precision, recall, and F-measure, with a maximum precision of 1, maximum recall of 0.956, and maximum F-measure of 0.977, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
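The precision, recall, and F-measure figures quoted in the abstract of result 29 can be related through the standard F-measure formula. The following is a minimal sketch, assuming the reported F-measure is the balanced F1 score and that the three maxima come from the same run (neither is stated in the abstract); the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing relevant is retrieved
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values quoted in the abstract; pairing them is an assumption.
reported_f = f_measure(precision=1.0, recall=0.956)
print(round(reported_f, 4))
```

Under those assumptions the computed value agrees with the quoted 0.977 to three decimal places.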
30. Distributed management of permission for access control model.
- Author
-
Cai, Fangbo, He, Jingsha, Ali Zardari, Zulfiqar, Han, Song, Elhoseny, Mohamed, and Yuan, X.
- Subjects
ACCESS control ,PRESSURE control ,INFORMATION retrieval ,COMPUTER network security ,INFORMATION resources ,INFORMATION storage & retrieval systems - Abstract
Access control is an important mechanism to protect sensitive information and relational system resources. The traditional access control model (TACM), such as DAC, MAC, RBAC, etc., is no longer suitable for open networks due to its lack of dynamic permission management. The growing number of network nodes makes information storage and resource access distributed. Because of its centralized management mode, the traditional access control model has low adaptive ability and a single deployment and application mode. Therefore, this access control environment inevitably puts pressure on access control authorization. In order to overcome the shortcomings of the traditional access control model, a new access control model named DMPAC (Distributed management of permission for access control model) is proposed in this paper. The authorization mechanism of the model manages access permissions in a distributed and dynamic way, and all nodes covered by the model have the opportunity to participate in the execution of access and control. DMPAC provides the benefits of traditional access control models in terms of secure access and dynamic management. We also describe the framework and execution process of the model and the application of DMPAC in access control. Finally, we present some experimental results to show that, while maintaining the effectiveness of distributed access control through the management of access permissions, DMPAC can achieve the performance of traditional access control models. [ABSTRACT FROM AUTHOR]
- Published
- 2020
31. 2SRM: Learning social signals for predicting relevant search results.
- Author
-
Badache, Ismail
- Subjects
SOCIAL learning ,FEATURE selection ,INFORMATION retrieval ,SOCIAL networks ,SOCIAL accounting ,POPULARITY - Abstract
Search systems based on both professional meta-data (e.g., title, description, etc.) and social signals (e.g., like, comment, rating, etc.) from social networks are a trending topic in the information retrieval (IR) field. This paper presents 2SRM (Social Signals Relevance Model), an IR approach which takes into account social signals (users' actions) as additional information to enhance a search. We hypothesize that these signals can play a role in estimating the a priori social importance (relevance) of the resource (document). In this paper, we first study the impact of each such signal on retrieval performance. Next, some social properties such as popularity, reputation and freshness are quantified using several signals. 2SRM combines the social relevance, estimated from these social signals and properties, with the conventional textual relevance. Finally, we investigate the effect of the social signals on retrieval effectiveness using state-of-the-art learning approaches. In order to identify the most effective signals, we adopt feature selection algorithms and the correlation between the signals. We evaluated the effectiveness of our approach on both the IMDb (Internet Movie Database) and SBS (Social Book Search) datasets, which contain movie and book resources and their social characteristics collected from several social networks. Our experimental results are statistically significant and reveal that incorporating social signals in the retrieval model is a promising approach for improving retrieval performance. [ABSTRACT FROM AUTHOR]
- Published
- 2020
32. A new aggregated search method.
- Author
-
Ma, Xiaohui and Kim, Young Ho
- Subjects
SEARCH engines ,SEARCH algorithms ,INFORMATION retrieval ,INTERNET searching ,VECTOR data ,MEDLINE - Abstract
Aggregated search is the task of integrating results from potentially multiple specialized search services, or verticals (images, videos, news, etc.), into the Web search results. Major search engines perform what is known as Aggregated Search. Aggregated search is relatively new and its advantages need to be evaluated. With the increasing size of the data sets on the network, the results retrieved from different dimensions are sometimes entirely different for a given search problem. In this paper, a new aggregated search algorithm is proposed. First, the retrieval data in the vertical domain are textualized, and Doc2Vec produces the vector representation of the data. Then the results are aggregated and output by dimension reduction and density clustering. The experimental results show that the model achieves good accuracy on 20 given queries and significantly improves the aggregated search results. Discussion of the results also allowed us to identify some useful thoughts concerning the evaluation of AS approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2020
33. Preface.
- Author
-
Kryszkiewicz, Marzena, Obiedkov, Sergei, and Raś, Zbigniew W.
- Subjects
PREFACES & forewords ,CONFERENCES & conventions ,DATA mining ,INFORMATION retrieval ,KNOWLEDGE management ,LATTICE theory - Published
- 2012
34. Secure retrieval method of network space data based on block chain technology.
- Author
-
Gao, Yaping and Wang, Huimin
- Subjects
BLOCKCHAINS ,DATABASES ,INFORMATION retrieval ,FAULT tolerance (Engineering) ,UPLOADING of data - Abstract
In order to solve the shortcomings of traditional methods in storage capacity, fault tolerance and time consuming, a secure data retrieval method based on block chain technology was proposed. Initialize the blockchain system, encrypt the cyberspace data through the public key, and upload the encrypted data by connecting the public key of the data aggregator. Complete the blockchain data consensus and verify the encrypted data through the aggregator workload calculation and cyberspace data verification. The blockchain data storage structure is constructed by combining the on-chain index table and the off-chain database. Decrypt the data retrieval instruction with the key, and then transfer the required data into the block chain data storage structure to achieve secure data retrieval in network space. Experimental results show that the proposed method has high storage capacity and fault tolerance, and the maximum running time is only 1.43 s. [ABSTRACT FROM AUTHOR]
- Published
- 2022
35. Digitizing health data for public health protection in the context of European and international coordination.
- Author
-
Tuzii, Jennifer
- Subjects
PUBLIC health surveillance ,HEALTH services administration ,INTERNATIONAL relations ,DIGITAL technology ,PRACTICAL politics ,DIGITAL health ,PUBLIC health ,INTERPROFESSIONAL relations ,GOVERNMENT policy ,INFORMATION retrieval ,FINANCIAL management ,CONTACT tracing ,COVID-19 pandemic - Abstract
BACKGROUND: The health sector has long been affected by programs, actions, plans to digitize data and care processes with a view to better protecting individual health, as well as public health, resulting in a slow and uneven development of different and often incompatible national services. OBJECTIVE: This paper aims to explore the grounds behind the urgency of turning the digital priority into concrete actions, as acknowledged by political leaders in the Rome Declaration, by explaining the capacity of digital tools to enhance healthcare management and the current obstacles. METHODS: It considers the progressive extension of the EU institutions' scope of action during the pandemic, the related supporting financial strategies launched and some examples of digital contact tracing systems. RESULTS: It emerged that the pandemic highlighted the inadequacy of purely national policies and the advantages of leveraging the digital health data processing for governance, surveillance and response to cross-border and global threats. CONCLUSIONS: Considering what emerged during the pandemic and the solemn commitment of the world's major political leaders, the solution to the still existing technical and organizational interoperability issues will no longer be postponed. [ABSTRACT FROM AUTHOR]
- Published
- 2022
36. A Novel classification framework for the Thirukkural for building an efficient search system.
- Author
-
Ramalingam, Anita and Navaneethakrishnan, Subalalitha Chinnaudayar
- Subjects
RANDOM forest algorithms ,INFORMATION retrieval ,SUPPORT vector machines ,NAIVE Bayes classification ,KEYWORD searching - Abstract
Thirukkural, a classic of Tamil literature written in 300 BCE, is a didactic work. Though Thirukkural comprises 1330 couplets organized into three sections and 133 chapters, a better organization of the Thirukkural is needed in order to retrieve meaningful couplets for a given query in search systems. This paper lays such a foundation by classifying the Thirukkural into ten new categories, called superclasses, that are helpful for building a better Information Retrieval (IR) system. The classifier is trained using the Multinomial Naïve Bayes algorithm. Each superclass is further classified into two subcategories based on the didactic information. The proposed classification framework is evaluated using precision, recall and F-score metrics and achieves an overall F-score of 82.33%; a comparative analysis has been done with the Support Vector Machine, Logistic Regression and Random Forest algorithms. An IR system is built on top of the proposed system, and its performance has been compared with Google search and a locally built keyword search. The proposed classification framework has achieved a mean average precision score of 89%, whereas Google search and keyword search have yielded 59% and 68% respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
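The abstract of result 36 compares systems by mean average precision (MAP). As a rough illustration of how that metric is usually computed, the sketch below averages precision at each relevant rank per query and then averages across queries. The function names and the relevance lists are hypothetical illustrations, not data from the paper, and the sketch assumes every relevant document appears somewhere in the ranked list.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of per-query average precision; 0.0 for an empty set of queries."""
    if not runs:
        return 0.0
    return sum(average_precision(run) for run in runs) / len(runs)


# Two made-up queries: True marks a relevant result at that rank.
toy_runs = [[True, False, True], [False, True]]
print(mean_average_precision(toy_runs))
```

For the toy input, the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is about 0.667.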
37. A method for determining ontology-based user profile in document retrieval system.
- Author
-
Maleszka, Bernadetta
- Subjects
INFORMATION retrieval ,DOCUMENTATION ,ONTOLOGY ,INFORMATION theory ,THEORY of knowledge ,PHILOSOPHY ,SEMANTICS - Abstract
Information overload has become a very important aspect of the information retrieval domain. Even if a user knows where to look for interesting information, he may have a problem precisely formulating his information needs. A solution to the problem is personalization and recommendation systems: they observe user activities and analyze them to discover important preferences. Based on this information, the system can improve the effectiveness of the results. In this paper we present a method for determining a user profile in a document retrieval system. We propose an ontology-based profile. Such a structure makes it possible to process semantic relations between users' queries. We focus on methods for adapting the profile, because only an up-to-date profile can help the user obtain results that correspond to his information needs. We present a set of postulates for adaptation methods. Experimental evaluations of the developed methods are promising. [ABSTRACT FROM AUTHOR]
- Published
- 2017
38. Extracting domain-specific stopwords for text classifiers.
- Author
-
Makrehchi, Masoud and Kamel, Mohamed S.
- Subjects
STOP words ,DOMAIN-specific programming languages ,CLASSIFIERS (Linguistics) ,NATURAL language processing ,INFORMATION filtering ,INFORMATION retrieval ,TEXT mining - Abstract
In this paper, an automatic generation of domain-specific stopwords from a large labeled corpus is proposed. In the majority of text mining tasks, stopwords are removed according to a standard stopword list and/or using high and low document frequencies. In this paper, a new approach for stopword extraction, based on the notion of backward filter-level performance and data sparsity index, is proposed. First, based on the proposed model to evaluate the extracted stopwords, we examine high document frequency filtering for stopword reduction. Secondly, a new algorithm for building general and domain-specific stopword lists is proposed. For the method, it is assumed that a set of candidate stopwords must have a minimum information content and prediction capacity that is measured by the performance of a classifier. We show that to avoid obtaining the classifier performance, it can be estimated by the sparsity of the training dataset. Moreover, it is confirmed that even if a given term ranking measure can perform well for the feature selection, the measure is not necessarily efficient for selecting poor features (stopwords). According to the comparative study, the newly devised approach offers more promising results that guarantee a minimum information loss by filtering out most stopwords. [ABSTRACT FROM AUTHOR]
- Published
- 2017
39. Multidimensional indexing technique for medical images retrieval.
- Author
-
Safaei, Ali Asghar and Habibi-Asl, Saeede
- Subjects
IMAGE retrieval ,INFORMATION storage & retrieval systems ,DIAGNOSTIC imaging ,RECALL (Information retrieval) ,PRECISION (Information retrieval) ,SEARCH engines ,DATA structures ,METADATA - Abstract
Retrieving required medical images from a huge collection of images is one of the most widely used features in medical information systems, including medical imaging search engines. For example, diagnostic decision making has traditionally been accompanied by patient data (image or non-image) and previous medical experience from similar cases. Indexing, as part of a search engine (or retrieval system), increases the speed of a search. The goal of this study is to provide an effective and efficient indexing technique for medical image search engines. In this paper, in order to achieve this goal, a multidimensional indexing technique for medical images is designed using the normalization technique that is used to reduce redundancy in relational database design. The data structure of the proposed multidimensional index, as well as the operations required to create and handle such an index, are designed. The time complexity of each operation is analyzed, and the average memory space required to store any medical image (along with its related metadata) is calculated as the space complexity analysis of the proposed indexing technique. The results show that the proposed indexing technique performs well in terms of memory usage, as well as execution time for the usual operations. Moreover, and maybe more importantly, the proposed indexing technique improves the precision and recall of the information retrieval system (i.e., search engine) which uses this technique for indexing medical images. Besides, a user of such a search engine can retrieve medical images whose attributes s/he has specified in several different aspects (dimensions), e.g., tissue, image modality and format, sickness and trauma, etc. So, the proposed multidimensional indexing technique can improve the effectiveness of a medical image information retrieval system (in terms of precision and recall), while maintaining adequate efficiency (in terms of execution time and memory usage), and can improve the information retrieval process for healthcare search engines. [ABSTRACT FROM AUTHOR]
- Published
- 2021
40. An optimal storage and repair mechanism for Group Repair Code in a distributed storage environment.
- Author
-
Mittal, Swati, Rakesh, Nitin, Matam, Rakesh, and Adhikari, Ashish K.
- Subjects
INFORMATION retrieval ,BANDWIDTHS ,SYSTEM failures - Abstract
This paper aims to reduce the storage space required for data storage in a distributed storage environment and to provide an optimal repair bandwidth when a system failure occurs. Previous scientific literature suggests various approaches, such as replication, erasure codes, local reconstruction, regenerating codes, etc., to recover from system failure. These approaches are applied to archival storage, cloud storage, etc. to provide data availability and reliability. Although these approaches have proved efficient, they have their own strengths and weaknesses, as some of them deal with storage improvement and others focus on providing an effective repair mechanism. In this paper, we present a new approach, Group Repair Codes, which provides optimal repair bandwidth by replicating the nodes and calculating parity nodes for smaller groups. In comparison to approaches (hybrid and double code) that provide optimal repair, it utilizes less storage space. Moreover, it improves fault tolerance, disk reads and the data transferred by the system when nodes fail. The current study is conducted considering various existing approaches, like replication, erasure codes, LRC, hybrid and double coding, that were implemented to manage big data. The results reported in the paper support the suitability of our approach. We have also discussed the significance of intelligent systems for the present study. We intend to propose an intelligent system for Group Repair Codes in the near future. We believe that our research will be beneficial for several communities, such as cloud storage, big data and distributed storage. [ABSTRACT FROM AUTHOR]
- Published
- 2018
41. Electronic information management and intellectual property rights.
- Author
-
Cornish, Graham P.
- Subjects
INFORMATION resources management ,MANAGEMENT ,INTELLECTUAL property ,INTANGIBLE property ,PROPERTY rights ,INFORMATION services ,INFORMATION retrieval ,INFORMATION science - Abstract
The paper examines the idea of copyright and how it functions for both digital and non-digital publications. Various different interpretations of copyright and its application are discussed. Ideas such as databases, fair use and exceptions are explored in their relationship to technological measures used to control the use of copyright material. Examples from the CITED, COPYSMART, IMPRIMATUR, and COPICAT projects of the European Union are described briefly. The impact of the latest EU directive on copyright and the information society is explained and the need for co-operative planning and implementation of technical measures throughout the information industry is emphasised. [ABSTRACT FROM AUTHOR]
- Published
- 2005
42. Notes on the Use of Ontologies in the Biochemical Domain.
- Author
-
Rojas, Isabel, Ratsch, Esther, Saric, Jasmin, and Wittig, Ulrike
- Subjects
ONTOLOGY ,BIOCHEMISTRY ,CHEMISTRY ,BIOLOGY ,DATABASES ,INFORMATION retrieval - Abstract
In this paper we aim at presenting the main flavours and uses that are given to the term ontology in the bio-domains. The paper does not intend to be a thorough review of the existing work in the area. It highlights the uses that are given to ontologies in the Scientific Databases and Visualisation Group at EML Research, in Heidelberg. [ABSTRACT FROM AUTHOR]
- Published
- 2004
43. Design of compound data acquisition gateway based on 5G network.
- Author
-
Hu, Jufen and Lorenzini, Giulio
- Subjects
5G networks ,ACQUISITION of data ,GATEWAYS (Computer networks) ,INFORMATION retrieval ,INTERNET of things - Abstract
With the wide application of industrial Internet of Things, the increasing amount of data and the complexity of data types, higher requirements are put forward for the performance of data acquisition gateway. In order to reduce the data acquisition time of the gateway and improve the data retrieval coverage of the gateway, a novel design method of composite data acquisition gateway based on 5G network is proposed. Based on the analysis of related technologies, the functional requirements of the composite data acquisition gateway are summarized, and the overall design of the gateway is completed. On this basis, the gateway hardware environment is constructed by designing the main control module, 5G module and FPGA program, and then the software program is designed by designing the data acquisition driver, 5G module driver, embedded software and protocol conversion process. The experimental results show that the data retrieval coverage of the gateway designed by this method is always above 92%, which is 6% higher than that of method 1. This shows that the method significantly improves the coverage of data search, speeds up the efficiency of data collection, and improves the performance of the data collection gateway, which proves the effectiveness and feasibility of the method and is conducive to promoting the intelligent development of the data collection gateway technology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
44. Semantic Web technologies and bias in artificial intelligence: A systematic literature review.
- Author
-
Reyero Lobo, Paula, Daga, Enrico, Alani, Harith, and Fernandez, Miriam
- Subjects
ARTIFICIAL intelligence ,INFORMATION retrieval ,DATA mining - Abstract
Bias in Artificial Intelligence (AI) is a critical and timely issue due to its sociological, economic and legal impact, as decisions made by biased algorithms could lead to unfair treatment of specific individuals or groups. Multiple surveys have emerged to provide a multidisciplinary view of bias or to review bias in specific areas such as social sciences, business research, criminal justice, or data mining. Given the ability of Semantic Web (SW) technologies to support multiple AI systems, we review the extent to which semantics can be a "tool" to address bias in different algorithmic scenarios. We provide an in-depth categorisation and analysis of bias assessment, representation, and mitigation approaches that use SW technologies. We discuss their potential in dealing with issues such as representing disparities of specific demographics or reducing data drifts, sparsity, and missing values. We find research works on AI bias that apply semantics mainly in information retrieval, recommendation and natural language processing applications and argue through multiple use cases that semantics can help deal with technical, sociological, and psychological challenges. [ABSTRACT FROM AUTHOR]
- Published
- 2023
45. Extracting Sexual Trauma Mentions from Electronic Medical Notes Using Natural Language Processing.
- Author
- Divita, Guy, Brignone, Emily, Carter, Marjorie E., Ying Suo, Blais, Rebecca K., Samore, Matthew H., Fargo, Jamison D., and Gundlapalli, Adi V.
- Abstract
Patient history of sexual trauma is of clinical relevance to healthcare providers as survivors face adverse health-related outcomes. This paper describes a method for identifying mentions of sexual trauma within the free text of electronic medical notes. A natural language processing pipeline for information extraction was developed and scaled to handle a large corpus of electronic medical notes used for this study from US Veterans Health Administration medical facilities. The tool was used to identify sexual trauma mentions and create snippets around every asserted mention based on a domain-specific lexicon developed for this purpose. All snippets were evaluated by trained human reviewers. An overall positive predictive value (PPV) of 0.90 for identifying sexual trauma mentions from the free text and a PPV of 0.71 at the patient level are reported. The metrics are superior for records from female patients. [ABSTRACT FROM AUTHOR]
- Published
- 2017
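For readers unfamiliar with the metric reported in the abstract above, positive predictive value (PPV) is the share of flagged mentions that reviewers confirm: PPV = TP / (TP + FP). The following is a minimal Python sketch of that formula only. The function name and the counts are hypothetical illustrations, not the study's pipeline or data.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP), the fraction of flagged items that are correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when nothing was flagged")
    return true_positives / flagged


# Hypothetical counts: 90 snippets confirmed by reviewers, 10 rejected.
ppv_example = positive_predictive_value(true_positives=90, false_positives=10)
```

The same function applies at the snippet level and at the patient level; only what counts as a "flagged item" changes.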
46. Learning search popularity for personalized query completion in information retrieval.
- Author
- Fei Cai, Wanyu Chen, and Xinliang Ou
- Subjects
COGNITIVE structures, FACILITATED learning, SEARCH algorithms, QUERYING (Computer science), INFORMATION architecture
- Abstract
Query completion approaches help searchers formulate queries with few keystrokes when using an information retrieval system, helping them avoid spelling mistakes and produce clear query formulations. Previous query completion algorithms return a ranked list of queries based mostly on the overall observed search popularity of query candidates across the whole query log. However, search popularity can change over time, i.e., it is time-sensitive. Ranking approaches based on overall search popularity may therefore perform poorly, and users may fail to find an acceptable query in the returned list, resulting in limited search satisfaction. Hence, this paper proposes a Learning-based Personalized Query Ranking approach (LQR) that exploits features of the observed and predicted search popularity, both in the whole log and in the recent period. Using a pair-wise learning scenario, the paper presents a method that generates a ranked list of query candidates and then reranks the candidates by their similarity to the current search context. The experimental results show that the proposed approach outperforms the baseline in terms of Mean Reciprocal Rank (MRR), with an average MRR improvement of 7% over the baseline. [ABSTRACT FROM AUTHOR]
- Published
- 2017
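The abstract above evaluates query completion with Mean Reciprocal Rank (MRR): for each test query, take the reciprocal of the rank at which the first acceptable completion appears (0 if it is absent), then average over all queries. The following is a minimal Python sketch of the metric itself. The function name and the sample ranks are assumptions made for illustration and do not come from the paper.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over queries.

    ranks[i] is the 1-based position of the first acceptable completion
    for query i, or None if no acceptable completion was returned.
    """
    if not ranks:
        raise ValueError("MRR is undefined for an empty set of queries")
    reciprocal = [0.0 if r is None else 1.0 / r for r in ranks]
    return sum(reciprocal) / len(reciprocal)


# Hypothetical ranks for four queries; the third query had no acceptable completion.
mrr_example = mean_reciprocal_rank([1, 2, None, 4])
```

Because each query contributes at most 1, MRR lies between 0 and 1, and moving the right completion from rank 2 to rank 1 matters far more than moving it from rank 9 to rank 8.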
47. A survey of Web technology for metadata aggregation in cultural heritage.
- Author
- Freire, Nuno, Isaac, Antoine, Robson, Glen, Brooks, John, and Manguinhas, Hugo
- Subjects
METADATA, LINKED data (Semantic Web), METADATA harvesting, CULTURAL property, INFORMATION retrieval, WEB search engines
- Abstract
On the World Wide Web, a very large number of resources are made available through digital libraries. The existence of many individual digital libraries, maintained by different organizations, brings challenges to the discoverability and usage of these resources by potential users. A widely used approach is metadata aggregation, where a central organization takes the role of facilitating the discoverability and use of the resources by collecting their associated metadata. The central organization can further promote the usage of the resources by means that cannot be efficiently undertaken by each digital library in isolation. This paper focuses on the domain of cultural heritage, where OAI-PMH has been the embraced solution, since discovery of resources was only feasible if based on metadata instead of full text. However, the technological landscape has changed. Nowadays, with the technological improvements accomplished by network communications, computational capacity, and Internet search engines, the motivation for adopting OAI-PMH is not as clear as it used to be. In this paper, we present the results of our analysis of available potential technologies, using as application context the Europeana Network and its requirements for metadata aggregation. We cover the following technologies: IIIF (International Image Interoperability Framework); Webmention; Linked Data Notifications; WebSub; Sitemaps; ResourceSync; Open Publication Distribution System (OPDS); Linked Data Platform; and Schema.org. [ABSTRACT FROM AUTHOR]
- Published
- 2017
48. The use of Semantic Web technologies for decision support - a survey.
- Author
- Blomqvist, Eva
- Subjects
SEMANTIC Web, DECISION support systems, ARTIFICIAL intelligence, DECISION theory, EXPERT systems, INFORMATION retrieval, INFORMATION storage & retrieval systems
- Abstract
The Semantic Web shares many goals with Decision Support Systems (DSS), e.g., being able to precisely interpret information, in order to deliver relevant, reliable and accurate information to a user when and where it is needed. DSS have in addition more specific goals, since the information need is targeted towards making a particular decision, e.g., making a plan or reacting to a certain situation. When surveying DSS literature, we discover applications ranging from Business Intelligence, via general purpose social networking and collaboration support, Information Retrieval and Knowledge Management, to situation awareness, emergency management, and simulation systems. The unifying element is primarily the purpose of the systems, and their focus on information management and provision, rather than the specific technologies they employ to reach these goals. Semantic Web technologies have been used in DSS during the past decade to solve a number of different tasks, such as information integration and sharing, web service annotation and discovery, and knowledge representation and reasoning. In this survey article, we present the results of a structured literature survey of Semantic Web technologies in DSS, together with the results of interviews with DSS researchers and developers both in industry and research organizations outside the university. The literature survey has been conducted using a structured method, where papers are selected from the publisher databases of some of the most prominent conferences and journals in both fields (Semantic Web and DSS), based on sets of relevant keywords representing the intersection of the two fields. Our main contribution is to analyze the landscape of semantic technologies in DSS, and provide an overview of current research as well as open research areas, trends and new directions. An added value is the conclusions drawn from interviews with DSS practitioners, which give an additional perspective on the potential of Semantic Web technologies in this field; including scenarios for DSS, and requirements for Semantic Web technologies that may attempt to support those scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2014
49. P2PCF: A collaborative filtering based recommender system for peer to peer social networks.
- Author
- Badis, Lyes, Amad, Mourad, Aïssani, Djamil, and Abbar, Sofiane
- Subjects
RECOMMENDER systems, SOCIAL networks, INFORMATION retrieval, PEERS
- Abstract
The recent privacy incidents reported in major media about global social networks raised real public concerns about centralized architectures. P2P social networks constitute an interesting paradigm to give back users control over their data and relations. While basic social network functionalities such as commenting, following, sharing, and publishing content are widely available, more advanced features related to information retrieval and recommendation are still challenging. This is due to the absence of a central server that has a complete view of the network. In this paper, we propose a new recommender system called P2PCF. We use collaborative filtering approach to recommend content in P2P social networks. P2PCF enables privacy preserving and tackles the cold start problem for both users and content. Our proposed approach assumes that the rating matrix is distributed within peers, in such a way that each peer only sees interactions made by her friends on her timeline. Recommendations are then computed locally within each peer before they are sent back to the requester. Our evaluations prove the effectiveness of our proposal compared to a centralized scheme in terms of recall and coverage. [ABSTRACT FROM AUTHOR]
- Published
- 2021
50. Independent document ranking for E-learning using semantic-based document term classification.
- Author
- Mannar Mannan, J., Sindhanai Selvan, K., and Mohemmed Yousuf, R.
- Subjects
ELECTRONIC records, CLASSIFICATION, SEARCH engines, INTERNET users, KNOWLEDGE base
- Abstract
The massive number of digital documents on the Internet has encouraged the use of e-learning, which has become an emerging field of research owing to the rapid growth in Internet users. E-learning requires a suitable document ranking method so that users need not navigate frequently to the next Search Engine Result Page (SERP). Existing document ranking methods do not rank documents independently based on their conceptual content. This paper proposes a novel method for ranking documents independently based on the classes of terms they contain. In this approach, terms are classified into five categories: (1) direct query terms, (2) expanded terms, (3) semantically related terms, (4) supporting terms and (5) stop words. The query is expanded using a domain ontology to acquire more semantic terms for a better understanding of the user query. A semantic weight is applied independently to the different categories of terms in a document for ranking. The document with the highest augmented value in each category of terms is ranked first; the remaining documents are ranked in the same way and arranged in descending order. WordNet is used as a knowledge base, and the Wu and Palmer semantic distance method is applied to measure the semantic distance between query and document terms for ranking the terms. The experiments show that the proposed document ranking method for e-learning retrieves better documents than existing document ranking methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
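The Wu and Palmer measure named in the abstract above scores two concepts by how deep their closest shared ancestor sits in a taxonomy: 2 * depth(lcs) / (depth(a) + depth(b)). The following is a minimal Python sketch on a tiny hand-made hierarchy. The taxonomy, the function names, and the convention that the root has depth 1 are all assumptions made for illustration; this is neither WordNet nor the authors' implementation.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: each node maps to its parent (the root maps to None).
TOY_PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain [node, ..., root] by following parent links."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def wu_palmer(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """Wu-Palmer similarity with depth counted in nodes (root has depth 1)."""
    path_a = path_to_root(a, parent)
    ancestors_b = set(path_to_root(b, parent))
    # The first node on a's upward path that is also above b is the deepest shared ancestor.
    lcs = next(n for n in path_a if n in ancestors_b)
    depth_lcs = len(path_to_root(lcs, parent))
    return 2 * depth_lcs / (len(path_a) + len(path_to_root(b, parent)))


sim_dog_cat = wu_palmer("dog", "cat", TOY_PARENT)      # shared ancestor: "animal"
sim_dog_plant = wu_palmer("dog", "plant", TOY_PARENT)  # shared ancestor: "entity"
```

Strictly speaking the measure is a similarity rather than a distance: terms that share a deep ancestor score closer to 1, so "dog" and "cat" score higher here than "dog" and "plant".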
Discovery Service for Jio Institute Digital Library