Author: "Knoth, Petr" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Knoth, Petr"' showing total 356 results

1. CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews

Author: Kusa, Wojciech, Mendoza, Oscar E., Samwald, Matthias, Knoth, Petr, and Hanbury, Allan
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Systematic literature reviews (SLRs) play an essential role in summarising, synthesising and validating scientific evidence. In recent years, there has been a growing interest in using machine learning techniques to automate the identification of relevant studies for SLRs. However, the lack of standardised evaluation datasets makes comparing the performance of such automated literature screening systems difficult. In this paper, we analyse the citation screening evaluation datasets, revealing that many of the available datasets are either too small, suffer from data leakage or have limited applicability to systems treating automated literature screening as a classification task, as opposed to, for example, a retrieval or question-answering task. To address these challenges, we introduce CSMeD, a meta-dataset consolidating nine publicly released collections, providing unified access to 325 SLRs from the fields of medicine and computer science. CSMeD serves as a comprehensive resource for training and evaluating the performance of automated citation screening models. Additionally, we introduce CSMeD-FT, a new dataset designed explicitly for evaluating the full text publication screening task. To demonstrate the utility of CSMeD, we conduct experiments and establish baselines on new datasets.
Comment: Accepted at NeurIPS 2023 Datasets and Benchmarks Track
Published: 2023

2. CRUISE-Screening: Living Literature Reviews Toolbox

Author: Kusa, Wojciech, Knoth, Petr, and Hanbury, Allan
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language, Computer Science - Digital Libraries
Abstract: Keeping up with research and finding related work is still a time-consuming task for academics. Researchers sift through thousands of studies to identify a few relevant ones. Automation techniques can help by increasing the efficiency and effectiveness of this task. To this end, we developed CRUISE-Screening, a web-based application for conducting living literature reviews - a type of literature review that is continuously updated to reflect the latest research in a particular field. CRUISE-Screening is connected to several search engines via an API, which allows for updating the search results periodically. Moreover, it can facilitate the process of screening for relevant publications by using text classification and question answering models. CRUISE-Screening can be used both by researchers conducting literature reviews and by those working on automating the citation screening process to validate their algorithms. The application is open-source: https://github.com/ProjectDoSSIER/cruise-screening, and a demo is available under this URL: https://citation-screening.ec.tuwien.ac.at. We discuss the limitations of our tool in Appendix A.
Comment: Paper accepted at CIKM 2023. The arXiv version has an extra section about limitations in the Appendix that is not present in the ACM version
Published: 2023

3. CORE-GPT: Combining Open Access research and large language models for credible, trustworthy question answering

Author: Pride, David, Cancellieri, Matteo, and Knoth, Petr
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: In this paper, we present CORE-GPT, a novel question-answering platform that combines GPT-based language models and more than 32 million full-text open access scientific articles from CORE. We first demonstrate that GPT3.5 and GPT4 cannot be relied upon to provide references or citations for generated text. We then introduce CORE-GPT which delivers evidence-based answers to questions, along with citations and links to the cited papers, greatly increasing the trustworthiness of the answers and reducing the risk of hallucinations. CORE-GPT's performance was evaluated on a dataset of 100 questions covering the top 20 scientific domains in CORE, resulting in 100 answers and links to 500 relevant articles. The quality of the provided answers and the relevance of the links were assessed by two annotators. Our results demonstrate that CORE-GPT can produce comprehensive and trustworthy answers across the majority of scientific domains, complete with links to genuine, relevant scientific articles.
Comment: 12 pages, accepted submission to TPDL2023
Published: 2023

4. Effective Matching of Patients to Clinical Trials using Entity Extraction and Neural Re-ranking

Author: Kusa, Wojciech, Mendoza, Óscar E., Knoth, Petr, Pasi, Gabriella, and Hanbury, Allan
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language
Abstract: Clinical trials (CTs) often fail due to inadequate patient recruitment. This paper tackles the challenges of CT retrieval by presenting an approach that addresses the patient-to-trials paradigm. Our approach involves two key components in a pipeline-based model: (i) a data enrichment technique for enhancing both queries and documents during the first retrieval stage, and (ii) a novel re-ranking schema that uses a Transformer network in a setup adapted to this task by leveraging the structure of the CT documents. We use named entity recognition and negation detection in both patient description and the eligibility section of CTs. We further classify patient descriptions and CT eligibility criteria into current, past, and family medical conditions. This extracted information is used to boost the importance of disease and drug mentions in both query and index for lexical retrieval. Furthermore, we propose a two-step training schema for the Transformer network used to re-rank the results from the lexical retrieval. The first step focuses on matching patient information with the descriptive sections of trials, while the second step aims to determine eligibility by matching patient information with the criteria section. Our findings indicate that the inclusion criteria section of the CT has a great influence on the relevance score in lexical models, and that the enrichment techniques for queries and documents improve the retrieval of relevant trials. The re-ranking strategy, based on our training schema, consistently enhances CT retrieval and shows improved performance by 15% in terms of precision at retrieving eligible trials. The results of our experiments suggest the benefit of making use of extracted entities. Moreover, our proposed re-ranking schema shows promising effectiveness compared to larger neural models, even with limited training data.
Comment: Under review
Published: 2023

5. Outcome-based Evaluation of Systematic Review Automation

Author: Kusa, Wojciech, Zuccon, Guido, Knoth, Petr, and Hanbury, Allan
Subjects: Computer Science - Information Retrieval
Abstract: Current methods of evaluating search strategies and automated citation screening for systematic literature reviews typically rely on counting the number of relevant and not relevant publications. This established practice, however, does not accurately reflect the reality of conducting a systematic review, because not all included publications have the same influence on the final outcome of the systematic review. More specifically, if an important publication gets excluded or included, this might significantly change the overall review outcome, while not including or excluding less influential studies may only have a limited impact. However, in terms of evaluation measures, all inclusion and exclusion decisions are treated equally and, therefore, failing to retrieve publications with little to no impact on the review outcome leads to the same decrease in recall as failing to retrieve crucial publications. We propose a new evaluation framework that takes into account the impact of the reported study on the overall systematic review outcome. We demonstrate the framework by extracting review meta-analysis data and estimating outcome effects using predictions from ranking runs on systematic reviews of interventions from the CLEF TAR 2019 shared task. We further measure how close the obtained outcomes are to the outcomes of the original review if the arbitrary rankings were used. We evaluate 74 runs using the proposed framework and compare the results with those obtained using standard IR measures. We find that accounting for the difference in review outcomes leads to a different assessment of the quality of a system than if traditional evaluation measures were used. Our analysis provides new insights into the evaluation of retrieval results in the context of systematic review automation, emphasising the importance of assessing the usefulness of each document beyond binary relevance.
Comment: Accepted at ICTIR2023
Published: 2023

6. Predicting article quality scores with machine learning: The UK Research Excellence Framework

Author: Thelwall, Mike, Kousha, Kayvan, Abdoli, Mahshid, Stuart, Emma, Makita, Meiko, Wilson, Paul, Levitt, Jonathan, Knoth, Petr, and Cancellieri, Matteo
Subjects: Computer Science - Digital Libraries, Computer Science - Artificial Intelligence
Abstract: National research evaluation initiatives and incentive schemes have previously chosen between simplistic quantitative indicators and time-consuming peer review, sometimes supported by bibliometrics. Here we assess whether artificial intelligence (AI) could provide a third alternative, estimating article quality using multiple bibliometric and metadata inputs. We investigated this using provisional three-level REF2021 peer review scores for 84,966 articles submitted to the UK Research Excellence Framework 2021, matching a Scopus record 2014-18 and with a substantial abstract. We found that accuracy is highest in the medical and physical sciences Units of Assessment (UoAs) and economics, reaching 42% above the baseline (72% overall) in the best case. This is based on 1000 bibliometric inputs and half of the articles used for training in each UoA. Prediction accuracies above the baseline for the social science, mathematics, engineering, arts, and humanities UoAs were much lower or close to zero. The Random Forest Classifier (standard or ordinal) and Extreme Gradient Boosting Classifier algorithms performed best from the 32 tested. Accuracy was lower if UoAs were merged or replaced by Scopus broad categories. We increased accuracy with an active learning strategy and by selecting articles with higher prediction probabilities, as estimated by the algorithms, but this substantially reduced the number of scores predicted.
Published: 2022

7. Confidence estimation of classification based on the distribution of the neural network output layer

Author: Taha, Abdel Aziz, Hennig, Leonhard, and Knoth, Petr
Subjects: Computer Science - Computation and Language
Abstract: One of the most common problems preventing the application of prediction models in the real world is lack of generalization: the accuracy of models, measured in the benchmark, does not repeat itself on future data, e.g. in the settings of real business. Relatively few methods exist that estimate the confidence of prediction models. In this paper, we propose novel methods that, given a neural network classification model, estimate the uncertainty of particular predictions generated by this model. Furthermore, we propose a method that, given a model and a confidence level, calculates a threshold that separates the predictions generated by this model into two subsets, one of which meets the given confidence level. In contrast to other methods, the proposed methods do not require any changes to existing neural networks, because they simply build on the output logit layer of a common neural network. In particular, the methods infer the confidence of a particular prediction based on the distribution of the logit values corresponding to this prediction. The proposed methods constitute a tool that is recommended for filtering predictions in the process of knowledge extraction, e.g. based on web scraping, where prediction subsets are identified that maximize precision at the cost of recall, which is less important due to the availability of data. The method has been tested on different tasks including relation extraction, named entity recognition and image classification to show the significant increase of accuracy achieved.
Comment: Draft
Published: 2022

8. An analysis of the Microsoft Academic Graph

Author: Herrmannova, Drahomira and Knoth, Petr
Published: 2016

9. Semantometrics in coauthorship networks : fulltext-based approach for analysing patterns of research collaboration

Author: Herrmannova, Drahomira and Knoth, Petr
Published: 2015

10. Automation of Citation Screening for Systematic Literature Reviews using Neural Networks: A Replicability Study

Author: Kusa, Wojciech, Hanbury, Allan, and Knoth, Petr
Subjects: Computer Science - Information Retrieval
Abstract: In the process of Systematic Literature Review, citation screening is estimated to be one of the most time-consuming steps. Multiple approaches to automate it using various machine learning techniques have been proposed. The first research papers that apply deep neural networks to this problem were published in the last two years. In this work, we conduct a replicability study of the first two deep learning papers for citation screening and evaluate their performance on 23 publicly available datasets. While we succeeded in replicating the results of one of the papers, we were unable to replicate the results of the other. We summarise the challenges involved in the replication, including difficulties in obtaining the datasets to match the experimental setup of the original papers and problems with executing the original source code. Motivated by this experience, we subsequently present a simpler model based on averaging word embeddings that outperforms one of the models on 18 out of 23 datasets and is, on average, 72 times faster than the second replicated approach. Finally, we measure the training time and the invariance of the models when exposed to a variety of input features and random initialisations, demonstrating differences in the robustness of these approaches.
Comment: Accepted at ECIR 2022
Published: 2022

11. Towards semantometrics : a new semantic similarity based measure for assessing a research publication's contribution

Author: Knoth, Petr and Herrmannova, Drahomira
Published: 2014

12. Visual search for supporting content exploration in large document collections

Author: Herrmannova, Drahomira and Knoth, Petr
Comment: 1st International Workshop on Mining Scientific Publications, George Washington University, Washington, DC, 14 Jun 2012
Published: 2012

13. CORE : three access levels to underpin open access

Author: Knoth, Petr and Zdrahal, Zdenek
Published: 2012

14. Ranking for Learning: Studying Users’ Perceptions of Relevance, Understandability, and Engagement

Author: Ghafourian, Yasin, Hanbury, Allan, and Knoth, Petr
Editor: Alonso, Omar, Cousijn, Helena, Silvello, Gianmaria, Marrero, Mónica, Teixeira Lopes, Carla, and Marchesin, Stefano
Published: 2023

15. Readability Measures as Predictors of Understandability and Engagement in Searching to Learn

Author: Ghafourian, Yasin, Hanbury, Allan, and Knoth, Petr
Editor: Alonso, Omar, Cousijn, Helena, Silvello, Gianmaria, Marrero, Mónica, Teixeira Lopes, Carla, and Marchesin, Stefano
Published: 2023

16. Research Collaboration Analysis Using Text and Graph Features

Author: Herrmannova, Drahomira, Knoth, Petr, Stahl, Christopher, Patton, Robert, and Wells, Jack
Editor: Gelbukh, Alexander
Published: 2023

17. Do Authors Deposit on Time? Tracking Open Access Policy Compliance

Author: Herrmannova, Drahomira, Pontika, Nancy, and Knoth, Petr
Subjects: Computer Science - Digital Libraries
Abstract: Recent years have seen fast growth in the number of policies mandating Open Access (OA) to research outputs. We conduct a large-scale analysis of over 800 thousand papers from repositories around the world published over a period of 5 years to investigate: a) if the time lag between the date of publication and date of deposit in a repository can be effectively tracked across thousands of repositories globally, and b) if introducing deposit deadlines is associated with a reduction of time from acceptance to public availability of research outputs. We show that after the introduction of the UK REF 2021 OA policy, this time lag has decreased significantly in the UK and that the policy introduction might have accelerated the UK's move towards immediate OA compared to other countries. This supports the argument for the inclusion of a time-limited deposit requirement in OA policies.
Comment: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Published: 2019

18. An analysis of work saved over sampling in the evaluation of automated citation screening in systematic literature reviews

Author: Kusa, Wojciech, Lipani, Aldo, Knoth, Petr, and Hanbury, Allan
Published: 2023

19. Research Collaboration Analysis Using Text and Graph Features

Author: Herrmannova, Drahomira, Knoth, Petr, Stahl, Christopher, Patton, Robert, and Wells, Jack
Published: 2023

20. Cui Bono? Cumulative Advantage in Open Access Publishing

Author: Pride, David, Cancellieri, Matteo, and Knoth, Petr
Editor: Silvello, Gianmaria, Corcho, Oscar, Manghi, Paolo, Di Nunzio, Giorgio Maria, Golub, Koraljka, Ferro, Nicola, and Poggi, Antonella
Published: 2022

21. A Systematic Review of Data Management Platforms

Author: Boch, Michael, Gindl, Stefan, Barnett, Alan, Margetis, George, Mireles, Victor, Adamakis, Emmanouil, and Knoth, Petr
Editor: Rocha, Alvaro, Adeli, Hojjat, Dzemyda, Gintautas, and Moreira, Fernando
Published: 2022

22. Online Evaluations for Everyone: Mr. DLib's Living Lab for Scholarly Recommendations

Author: Beel, Joeran, Collins, Andrew, Kopp, Oliver, Dietz, Linus W., and Knoth, Petr
Subjects: Computer Science - Information Retrieval, Computer Science - Digital Libraries, Computer Science - Machine Learning
Abstract: We introduce the first 'living lab' for scholarly recommender systems. This lab allows recommender-system researchers to conduct online evaluations of their novel algorithms for scholarly recommendations, i.e., recommendations for research papers, citations, conferences, research grants, etc. Recommendations are delivered through the living lab's API to platforms such as reference management software and digital libraries. The living lab is built on top of the recommender-system as-a-service Mr. DLib. Current partners are the reference management software JabRef and the CORE research team. We present the architecture of Mr. DLib's living lab as well as usage statistics on the first sixteen months of operating it. During this time, 1,826,643 recommendations were delivered with an average click-through rate of 0.21%.
Comment: Published at the 41st European Conference on Information Retrieval (ECIR) 2019
Published: 2018

23. Peer review and citation data in predicting university rankings, a large-scale analysis

Author: Pride, David and Knoth, Petr
Subjects: Computer Science - Digital Libraries
Abstract: Most Performance-based Research Funding Systems (PRFS) draw on peer review and bibliometric indicators, two different methodologies which are sometimes combined. A common argument against the use of indicators in such research evaluation exercises is their low correlation at the article level with peer review judgments. In this study, we analyse 191,000 papers from 154 higher education institutes which were peer reviewed in a national research evaluation exercise. We combine these data with 6.95 million citations to the original papers. We show that when citation-based indicators are applied at the institutional or departmental level, rather than at the level of individual papers, surprisingly large correlations with peer review judgments can be observed, up to r <= 0.802, n = 37, p < 0.001 for some disciplines. In our evaluation of ranking prediction performance based on citation data, we show we can reduce the mean rank prediction error by 25% compared to previous work. This suggests that citation-based indicators are sufficiently aligned with peer review results at the institutional level to be used to lessen the overall burden of peer review on national evaluation exercises leading to considerable cost savings.
Comment: 12 pages, 7 tables, 2 figures. Submitted to TPDL2018
Published: 2018

24. Do Citations and Readership Identify Seminal Publications?

Author: Herrmannova, Drahomira, Patton, Robert M., Knoth, Petr, and Stahl, Christopher G.
Subjects: Computer Science - Digital Libraries, Physics - Physics and Society
Abstract: In this paper, we show that citation counts work better than a random baseline (by a margin of 10%) in distinguishing excellent research, while Mendeley reader counts don't work better than the baseline. Specifically, we study the potential of these metrics for distinguishing publications that caused a change in a research field from those that have not. The experiment has been conducted on a new dataset for bibliometric research called TrueImpactDataset. TrueImpactDataset is a collection of research publications of two types -- research papers which are considered seminal works in their area and papers which provide a literature review of a research area. We provide overview statistics of the dataset and propose to use it for validating research evaluation metrics. Using the dataset, we conduct a set of experiments to study how citation and reader counts perform in distinguishing these publication types, following the intuition that causing a change in a field signifies research contribution. We show that citation counts help in distinguishing research that strongly influenced later developments from works that predominantly discuss the current state of the art with a degree of accuracy (63%, i.e. 10% over the random baseline). In all setups, Mendeley reader counts perform worse than a random baseline.
Comment: Accepted to journal Scientometrics
Published: 2018

25. Incidental or influential? - Challenges in automatically detecting citation importance using publication full texts

Author: Pride, David and Knoth, Petr
Subjects: Computer Science - Digital Libraries
Abstract: This work looks in depth at several studies that have attempted to automate the process of citation importance classification based on the publication's full text. We analyse a range of features that have been previously used in this task. Our experimental results confirm that the number of in-text references is highly predictive of influence. Contrary to the work of Valenzuela et al., we find abstract similarity one of the most predictive features. Overall, we show that many of the features previously described in literature are not particularly predictive. Consequently, we discuss challenges and potential improvements in the classification pipeline, provide a critical review of the performance of individual features and address the importance of constructing a large-scale gold standard reference dataset.
Published: 2017

26. Classifying document types to enhance search and recommendations in digital libraries

Author: Charalampous, Aristotelis and Knoth, Petr
Subjects: Computer Science - Digital Libraries
Abstract: In this paper, we address the problem of classifying documents available from the global network of (open access) repositories according to their type. We show that the metadata provided by repositories enabling us to distinguish research papers, theses and slides are missing in over 60% of cases. While these metadata describing document types are useful in a variety of scenarios ranging from research analytics to improving search and recommender (SR) systems, this problem has not yet been sufficiently addressed in the context of the repositories infrastructure. We have developed a new approach for classifying document types using supervised machine learning based exclusively on text-specific features. We achieve 0.96 F1-score using the random forest and Adaboost classifiers, which are the best performing models on our data. By analysing the SR system logs of the CORE [1] digital library aggregator, we show that users are an order of magnitude more likely to click on research papers and theses than on slides. This suggests that using document types as a feature for ranking/filtering SR results in digital libraries has the potential to improve user experience.
Comment: 12 pages, 21st International Conference on Theory and Practice of Digital Libraries (TPDL), 2017, Thessaloniki, Greece
Published: 2017

27. Towards effective research recommender systems for repositories

Author: Knoth, Petr, Anastasiou, Lucas, Charalampous, Aristotelis, Cancellieri, Matteo, Pearce, Samuel, Pontika, Nancy, and Bayer, Vaclav
Subjects: Computer Science - Digital Libraries, Computer Science - Information Retrieval
Abstract: In this paper, we argue why and how the integration of recommender systems for research can enhance the functionality and user experience in repositories. We present the latest technical innovations in the CORE Recommender, which provides research article recommendations across the global network of repositories and journals. The CORE Recommender has been recently redeveloped and released into production in the CORE system and has also been deployed in several third-party repositories. We explain the design choices of this unique system and the evaluation processes we have in place to continue raising the quality of the provided recommendations. By drawing on our experience, we discuss the main challenges in offering a state-of-the-art recommender solution for repositories. We highlight two of the key limitations of the current repository infrastructure with respect to developing research recommender systems: 1) the lack of a standardised protocol and capabilities for exposing anonymised user-interaction logs, which represent critically important input data for recommender systems based on collaborative filtering and 2) the lack of a voluntary global sign-on capability in repositories, which would enable the creation of personalised recommendation and notification solutions based on past user interactions.
Comment: In proceedings of Open Repositories 2017, Brisbane, Australia
Published: 2017

28. Simple Yet Effective Methods for Large-Scale Scholarly Publication Ranking

Author: Herrmannova, Drahomira and Knoth, Petr
Subjects: Computer Science - Information Retrieval, Computer Science - Digital Libraries
Abstract: With the growing amount of published research, automatic evaluation of scholarly publications is becoming an important task. In this paper we address this problem and present a simple and transparent approach for evaluating the importance of scholarly publications. Our method has been ranked among the top performers in the WSDM Cup 2016 Challenge. The first part of this paper describes our method. In the second part we present potential improvements to the method and analyse the evaluation setup which was provided during the challenge. Finally, we discuss future challenges in automatic evaluation of papers including the use of full-text-based evaluation methods.
Comment: WSDM Cup 2016 - Entity Ranking Challenge. The 9th ACM International Conference on Web Search and Data Mining, San Francisco, CA, USA. February 22-25, 2016
Published: 2016

29. Semantometrics: Towards Fulltext-based Research Evaluation

Author: Herrmannova, Drahomira and Knoth, Petr
Subjects: Computer Science - Digital Libraries
Abstract: Over the recent years, there has been a growing interest in developing new research evaluation methods that could go beyond the traditional citation-based metrics. This interest is motivated on one side by the wider availability or even emergence of new information evidencing research performance, such as article downloads, views and Twitter mentions, and on the other side by the continued frustrations and problems surrounding the application of purely citation-based metrics to evaluate research performance in practice. Semantometrics are a new class of research evaluation metrics which build on the premise that full-text is needed to assess the value of a publication. This paper reports on the analysis carried out with the aim to investigate the properties of the semantometric contribution measure, which uses semantic similarity of publications to estimate research contribution, and provides a comparative study of the contribution measure with traditional bibliometric measures based on citation counting.
Comment: 16th ACM/IEEE-CS Joint Conference on Digital Libraries, Newark, NJ, USA, June 19-23 2016
Published: 2016

30. A Systematic Review of Data Management Platforms

Author: Boch, Michael, Gindl, Stefan, Barnett, Alan, Margetis, George, Mireles, Victor, Adamakis, Emmanouil, and Knoth, Petr
Published: 2022

31. Automation of Citation Screening for Systematic Literature Reviews Using Neural Networks: A Replicability Study

Author: Kusa, Wojciech, Hanbury, Allan, and Knoth, Petr
Published: 2022

32. Cui Bono? Cumulative Advantage in Open Access Publishing

Author: Pride, David, Cancellieri, Matteo, and Knoth, Petr
Published: 2022
Full Text: View/download PDF

33. Value dissonance in research(er) assessment: individual and perceived institutional priorities in review, promotion, and tenure.

Author: Ross-Hellauer, Tony, Klebel, Thomas, Knoth, Petr, and Pontika, Nancy
Abstract: There are currently broad moves to reform research assessment, especially to better incentivize open and responsible research and avoid problematic use of inappropriate quantitative indicators. This study adds to the evidence base for such decision-making by investigating researcher perceptions of current processes of research assessment in institutional review, promotion, and tenure processes. Analysis of an international survey of 198 respondents reveals a disjunct between personal beliefs and perceived institutional priorities ('value dissonance'), with practices of open and responsible research, as well as 'research citizenship' comparatively poorly valued by institutions at present. Our findings hence support current moves to reform research assessment. But we also add crucial nuance to the debate by discussing the relative weighting of open and responsible practices and suggesting that fostering research citizenship activities like collegiality and mentorship may be an important way to rebalance criteria towards environments, which better foster quality, openness, and responsibility.
Published: 2024
Full Text: View/download PDF

34. Online Evaluations for Everyone: Mr. DLib’s Living Lab for Scholarly Recommendations

Author: Beel, Joeran, Collins, Andrew, Kopp, Oliver, Dietz, Linus W., Knoth, Petr, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Azzopardi, Leif, editor, Stein, Benno, editor, Fuhr, Norbert, editor, Mayr, Philipp, editor, Hauff, Claudia, editor, and Hiemstra, Djoerd, editor
Published: 2019
Full Text: View/download PDF

35. Linking textual resources to support information discovery

Author: Knoth, Petr
Subjects: 025.04
Abstract: A vast amount of information is today stored in the form of textual documents, many of which are available online. These documents come from different sources and are of different types. They include newspaper articles, books, corporate reports, encyclopedia entries and research papers. At a semantic level, these documents contain knowledge, which was created by explicitly connecting information and expressing it in the form of a natural language. However, a significant amount of knowledge is not explicitly stated in a single document, yet can be derived or discovered by researching, i.e. accessing, comparing, contrasting and analysing, information from multiple documents. Carrying out this work using traditional search interfaces is tedious due to information overload and the difficulty of formulating queries that would help us to discover information we are not aware of. In order to support this exploratory process, we need to be able to effectively navigate between related pieces of information across documents. While information can be connected using manually curated cross-document links, this approach not only does not scale, but cannot systematically assist us in the discovery of sometimes non-obvious (hidden) relationships. Consequently, there is a need for automatic approaches to link discovery. This work studies how people link content, investigates the properties of different link types, presents new methods for automatic link discovery and designs a system in which link discovery is applied on a collection of millions of documents to improve access to public knowledge.
Published: 2015
Full Text: View/download PDF

36. Value dissonance in research(er) assessment: individual and perceived institutional priorities in review, promotion, and tenure

Author: Ross-Hellauer, Tony, primary, Klebel, Thomas, additional, Knoth, Petr, additional, and Pontika, Nancy, additional
Published: 2023
Full Text: View/download PDF

37. Prompting Strategies for Citation Classification

Author: Kunnath, Suchetha N., primary, Pride, David, additional, and Knoth, Petr, additional
Published: 2023
Full Text: View/download PDF

38. Building Scalable Digital Library Ingestion Pipelines Using Microservices

Author: Cancellieri, Matteo, Pontika, Nancy, Pearce, Samuel, Anastasiou, Lucas, Knoth, Petr, Barbosa, Simone Diniz Junqueira, Series editor, Chen, Phoebe, Series editor, Filipe, Joaquim, Series editor, Kotenko, Igor, Series editor, Sivalingam, Krishna M., Series editor, Washio, Takashi, Series editor, Yuan, Junsong, Series editor, Zhou, Lizhu, Series editor, Garoufallou, Emmanouel, editor, Virkus, Sirje, editor, Siatri, Rania, editor, and Koutsomiha, Damiana, editor
Published: 2017
Full Text: View/download PDF

39. What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers

Author: Knoth, Petr, Gooch, Phil, Jack, Kris, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Kamps, Jaap, editor, Tsakonas, Giannis, editor, Manolopoulos, Yannis, editor, Iliadis, Lazaros, editor, and Karydis, Ioannis, editor
Published: 2017
Full Text: View/download PDF

40. Outcome-based Evaluation of Systematic Review Automation

Author: Kusa, Wojciech, primary, Zuccon, Guido, additional, Knoth, Petr, additional, and Hanbury, Allan, additional
Published: 2023
Full Text: View/download PDF

41. Effective matching of patients to clinical trials using entity extraction and neural re-ranking

Author: Kusa, Wojciech, primary, Mendoza, Óscar E., additional, Knoth, Petr, additional, Pasi, Gabriella, additional, and Hanbury, Allan, additional
Published: 2023
Full Text: View/download PDF

42. VoMBaT: A Tool for Visualising Evaluation Measure Behaviour in High-Recall Search Tasks

Author: Kusa, Wojciech, primary, Lipani, Aldo, additional, Knoth, Petr, additional, and Hanbury, Allan, additional
Published: 2023
Full Text: View/download PDF

43. Explainable online health information truthfulness in Consumer Health Search

Author: Upadhyay, Rishabh, primary, Knoth, Petr, additional, Pasi, Gabriella, additional, and Viviani, Marco, additional
Published: 2023
Full Text: View/download PDF

44. CORE: A Global Aggregation Service for Open Access Papers

Author: Knoth, Petr, primary, Herrmannova, Drahomira, additional, Cancellieri, Matteo, additional, Anastasiou, Lucas, additional, Pontika, Nancy, additional, Pearce, Samuel, additional, Gyawali, Bikash, additional, and Pride, David, additional
Published: 2023
Full Text: View/download PDF

45. CORE Dashboard: A tool for the management of open access content in repositories

Author: Knoth, Petr, Pride, David, Pavlenko, Viktoriia, and Cancellieri, Matteo
Subjects: Metadata, OR2023, Content aggregation, Interoperability, Repositories network
Abstract: CORE (https://core.ac.uk) works directly with the global network of Open Repositories, providing a range of tools and services for repositories and repository managers. This presentation introduces the newly updated CORE Repository Dashboard, a free service for repositories whose Open Access content is aggregated by CORE that delivers measurable benefits to repositories in two distinct areas. First, by providing a range of tools to assist in a range of repository management tasks and second, by greatly increasing discoverability of the repository's content. The Repository Dashboard enables managers to manage their content, provides reporting and download statistics, offers compliance checking against Open Access policies (such as REF in the United Kingdom), delivers the ability to register the repositories' OAI identifiers for global resolution with the CORE Resolver (https://oai.core.ac.uk/) and can automatically discover metadata enrichments (such as DOIs) for the repository's content from external sources and other repositories. Further, the Dashboard is the central hub for accessing other free services for data providers, such as CORE's Recommender and Discovery services.
Published: 2023
Full Text: View/download PDF

46. Rioxx 3: A Modernised Metadata Profile

Author: Walk, Paul, Macgregor, George, and Knoth, Petr
Subjects: Metadata profile, Metadata, Harvesting, OR2023, Rioxx
Abstract: Rioxx (formerly RIOXX) is a metadata application profile which was originally developed to facilitate reporting to funders in the UK. Since then, over the last 7 years it has proved useful also to aggregator services harvesting metadata records from repositories, and feedback from those services has indicated a number of ways in which Rioxx could be improved. This presentation will explain how Rioxx has a new governance group which has been working since 2019 to prepare a new version, one which is designed to meet a broader range of use-cases. We will focus on the changes we have made, including: far greater use of persistent identifiers (PIDs); a greater focus on the Web as the overarching context (i.e. use of HTTP(S) URIs); and greater support for expressing important "events" in the lifecycle of scholarly publications, in response to requirements from open-access funders. We will also compare and contrast Rioxx with OpenAIRE. Finally, we explain the "radically open" approach we have taken to development, involving community feedback at each stage. Rioxx is no longer a UK-specific profile, and we believe that Rioxx v3 has potential value for repositories in the global context.
Published: 2023
Full Text: View/download PDF

47. Facilitating community wide adoption of metadata standards and guidelines through validation and monitoring: The Rioxx v3 EPrints adoption use case

Author: Cancellieri, Matteo and Knoth, Petr
Subjects: funder policies, metadata, validation and monitoring, EPrints, interoperability, OR2023, Repository metadata interoperability, Rioxx, compliance, repositories, FAIR
Abstract: This presentation introduces a generic approach to supporting the repository community's implementation of new metadata standards and guidelines. We will first discuss the challenges in adopting metadata standards and guidelines across repositories in an interoperable manner, sharing our decade-long experience with harvesting and processing metadata from repositories. We will then describe the components needed to ensure a community-driven approach which we argue will lead to a better adoption across the sector, benefitting all stakeholders. We will illustrate this approach on a project the CORE (core.ac.uk) team is currently carrying out to support the uptake of Rioxx v3 by the EPrints community.
Published: 2023
Full Text: View/download PDF

48. Predicting Student Performance from Combined Data Sources

Author: Wolff, Annika, Zdrahal, Zdenek, Herrmannova, Drahomira, Knoth, Petr, Kacprzyk, Janusz, Series editor, and Peña-Ayala, Alejandro, editor
Published: 2014
Full Text: View/download PDF

49. Peer Review and Citation Data in Predicting University Rankings, a Large-Scale Analysis

Author: Pride, David, primary and Knoth, Petr, additional
Published: 2018
Full Text: View/download PDF

50. Value dissonance in research(er) assessment: Individual and institutional priorities in review, promotion and tenure criteria

Author: Ross-Hellauer, Tony, primary, Klebel, Thomas, additional, Knoth, Petr, additional, and Pontika, Nancy, additional
Published: 2023
Full Text: View/download PDF
