39 results for "Minervini, Pasquale"
Search Results
2. Conditional computation in neural networks: Principles and research trends.
- Author
Scardapane, Simone, Baiocchi, Alessandro, Devoto, Alessio, Marsocci, Valerio, Minervini, Pasquale, and Pomponi, Jary
- Subjects
MODULAR design, SCIENTIFIC discoveries, DESIGN
- Abstract
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts of their computational graph conditionally on their input. Examples include the dynamic selection of, e.g., input tokens, layers (or sets of layers), and sub-modules inside each layer (e.g., channels in a convolutional filter). We first provide a general formalism to describe these techniques in a uniform way. Then, we introduce three notable implementations of these principles: mixture-of-experts (MoEs) networks, token selection mechanisms, and early-exit neural networks. The paper aims to provide a tutorial-like introduction to this growing field. To this end, we analyze the benefits of these modular designs in terms of efficiency, explainability, and transfer learning, with a focus on emerging applicative areas ranging from automated scientific discovery to semantic communication. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
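To make the mixture-of-experts routing surveyed in this article concrete, here is a minimal PyTorch sketch of top-k gated routing; the expert architecture, sizes, and names (TopKMoE, n_experts, k) are illustrative assumptions, not code from the paper.

```python
# Minimal sketch of top-k mixture-of-experts routing (illustrative, not from the article).
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):  # x: (batch, d_model)
        scores = self.gate(x)                         # (batch, n_experts)
        top_w, top_idx = scores.topk(self.k, dim=-1)  # route each input to k experts
        top_w = torch.softmax(top_w, dim=-1)          # renormalise over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                    # only the selected experts are executed
            for e in range(len(self.experts)):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += top_w[mask, slot, None] * self.experts[e](x[mask])
        return out

x = torch.randn(4, 16)
print(TopKMoE(16)(x).shape)  # torch.Size([4, 16])
```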
3. No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
- Author
Kaddour, Jean, Key, Oscar, Nawrot, Piotr, Minervini, Pasquale, and Kusner, Matt J.
- Subjects
Performance (cs.PF), FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Performance, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Neural and Evolutionary Computing (cs.NE), Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
The computation necessary for training Transformer-based language models has skyrocketed in recent years. This trend has motivated research on efficient training algorithms designed to improve training, validation, and downstream performance faster than standard training. In this work, we revisit three categories of such algorithms: dynamic architectures (layer stacking, layer dropping), batch selection (selective backprop, RHO loss), and efficient optimizers (Lion, Sophia). When pre-training BERT and T5 with a fixed computation budget using such methods, we find that their training, validation, and downstream gains vanish compared to a baseline with a fully-decayed learning rate. We define an evaluation protocol that enables computation to be done on arbitrary machines by mapping all computation time to a reference machine which we call reference system time. We discuss the limitations of our proposed protocol and release our code to encourage rigorous research in efficient training procedures: https://github.com/JeanKaddour/NoTrainNoGain.
- Published
- 2023
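The reference-system-time protocol mentioned in this abstract can be sketched as mapping wall-clock measurements onto a fixed reference machine via profiled per-operation speed ratios; the ratios below are invented for illustration, and the paper's actual protocol may differ in detail.

```python
# Sketch of mapping wall-clock measurements onto a reference machine
# ("reference system time"); all numbers below are invented for illustration.
measured_seconds = {"forward": 120.0, "backward": 240.0, "optimizer_step": 30.0}
# ratio = time on the reference machine / time on this machine, profiled per operation
reference_speed_ratio = {"forward": 1.8, "backward": 1.7, "optimizer_step": 2.1}

reference_time = sum(measured_seconds[op] * reference_speed_ratio[op]
                     for op in measured_seconds)
print(f"reference system time: {reference_time:.1f}s")
```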
4. Knowledge Graph Embeddings in the Biomedical Domain: Are They Useful? A Look at Link Prediction, Rule Learning, and Downstream Polypharmacy Tasks
- Author
Gema, Aryo Pradipta, Grabarczyk, Dominik, De Wulf, Wolf, Borole, Piyush, Alfaro, Javier Antonio, Minervini, Pasquale, Vergari, Antonio, and Rajan, Ajitha
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
- Abstract
Knowledge graphs are powerful tools for representing and organising complex biomedical data. Several knowledge graph embedding algorithms have been proposed to learn from and complete knowledge graphs. However, a recent study demonstrates the limited efficacy of these embedding algorithms when applied to biomedical knowledge graphs, raising the question of whether knowledge graph embeddings have limitations in biomedical settings. This study aims to apply state-of-the-art knowledge graph embedding models in the context of a recent biomedical knowledge graph, BioKG, and evaluate their performance and potential downstream uses. We achieve a three-fold improvement in terms of performance based on the HITS@10 score over previous work on the same biomedical knowledge graph. Additionally, we provide interpretable predictions through a rule-based method. We demonstrate that knowledge graph embedding models are applicable in practice by evaluating the best-performing model on four tasks that represent real-life polypharmacy situations. Results suggest that knowledge learnt from large biomedical knowledge graphs can be transferred to such downstream use cases. Our code is available at https://github.com/aryopg/biokge.
- Published
- 2023
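For readers unfamiliar with the HITS@10 metric cited above, here is a minimal sketch of how it, and the related Mean Reciprocal Rank (MRR), are computed from the rank of each true triple among corrupted candidates; the ranks are hypothetical.

```python
# Sketch of the standard link-prediction ranking metrics (ranks are hypothetical).
def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k=10):
    return sum(r <= k for r in ranks) / len(ranks)

ranks = [1, 3, 12, 2, 50, 7]          # rank of each true triple among candidates
print(mrr(ranks), hits_at_k(ranks))   # ~0.35, ~0.67
```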
5. Logical Reasoning for Natural Language Inference Using Generated Facts as Atoms
- Author
Stacey, Joe, Minervini, Pasquale, Dubossarsky, Haim, Camburu, Oana-Maria, and Rei, Marek
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, I.2.7, Computation and Language (cs.CL)
- Abstract
State-of-the-art neural models can now reach human performance levels across various natural language understanding tasks. However, despite this impressive performance, models are known to learn from annotation artefacts at the expense of the underlying task. While interpretability methods can identify influential features for each prediction, there are no guarantees that these features are responsible for the model decisions. Instead, we introduce a model-agnostic logical framework to determine the specific information in an input responsible for each model decision. This method creates interpretable Natural Language Inference (NLI) models that maintain their predictive power. We achieve this by generating facts that decompose complex NLI observations into individual logical atoms. Our model makes predictions for each atom and uses logical rules to decide the class of the observation based on the predictions for each atom. We apply our method to the highly challenging ANLI dataset, where our framework improves the performance of both a DeBERTa-base and BERT baseline. Our method performs best on the most challenging examples, achieving a new state-of-the-art for the ANLI round 3 test set. We outperform every baseline in a reduced-data setting, and despite using no annotations for the generated facts, our model predictions for individual facts align with human expectations.
- Published
- 2023
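The atom-level aggregation described in this abstract can be sketched as follows; the specific rule set (any contradicting atom implies contradiction, all atoms entailed implies entailment, otherwise neutral) is one plausible reading, and the paper's exact rules may differ.

```python
# Sketch of aggregating per-atom NLI predictions into an observation-level label.
def aggregate(atom_labels):
    # One plausible rule set (an assumption, not necessarily the paper's exact rules):
    if any(l == "contradiction" for l in atom_labels):
        return "contradiction"
    if all(l == "entailment" for l in atom_labels):
        return "entailment"
    return "neutral"

print(aggregate(["entailment", "entailment"]))   # entailment
print(aggregate(["neutral", "contradiction"]))   # contradiction
print(aggregate(["neutral", "entailment"]))      # neutral
```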
6. SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations
- Author
Solano, Jesus, Camburu, Oana-Maria, and Minervini, Pasquale
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL)
- Abstract
Explaining the decisions of neural models is crucial for ensuring their trustworthiness at deployment time. Using Natural Language Explanations (NLEs) to justify a model's predictions has recently gained increasing interest. However, this approach usually demands large datasets of human-written NLEs for the ground-truth answers, which are expensive and potentially infeasible for some applications. For models to generate high-quality NLEs when only a few NLEs are available, fine-tuning Pre-trained Language Models (PLMs) in conjunction with prompt-based learning has recently emerged as a promising approach. However, PLMs typically have billions of parameters, making fine-tuning expensive. We propose SparseFit, a sparse few-shot fine-tuning strategy that leverages discrete prompts to jointly generate predictions and NLEs. We experiment with SparseFit on the T5 model and four datasets and compare it against state-of-the-art parameter-efficient fine-tuning techniques. We perform automatic and human evaluations to assess the quality of the model-generated NLEs, finding that fine-tuning only 6.8% of the model parameters leads to competitive results for both the task performance and the quality of the NLEs.
- Published
- 2023
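A minimal sketch of the sparse fine-tuning idea: freeze everything, then mark only a small named subset of parameters as trainable. Choosing layer norms and the LM head as that subset is an assumption for illustration; SparseFit's actual sparse components may differ.

```python
# Sketch of sparse fine-tuning: unfreeze only a small, named parameter subset.
import torch.nn as nn

def sparse_finetune_setup(model: nn.Module, trainable_substrings=("layer_norm", "lm_head")):
    total, trainable = 0, 0
    for name, p in model.named_parameters():
        p.requires_grad = any(s in name for s in trainable_substrings)
        total += p.numel()
        trainable += p.numel() if p.requires_grad else 0
    print(f"trainable: {100 * trainable / total:.1f}% of parameters")

class Toy(nn.Module):  # hypothetical stand-in for a large PLM
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(512, 512)
        self.layer_norm = nn.LayerNorm(512)
        self.lm_head = nn.Linear(512, 32000)

sparse_finetune_setup(Toy())
```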
7. Integration of Clinical Information and Imputed Aneuploidy Scores to Enhance Relapse Prediction in Early Stage Lung Cancer Patients
- Author
Timilsina, Mohan, Buosi, Samuele, Fey, Dirk, Janik, Adrianna, Torrente, Maria, Provencio, Mariano, Bermúdez, Alberto Cruz, Carcereny, Enric, Costabello, Luca, Abreu, Delvys Rodríguez, Cobo, Manuel, Castro, Rafael López, Bernabé, Reyes, Guirado, Maria, Minervini, Pasquale, and Nováček, Vít
- Subjects
Articles
- Abstract
Early-stage lung cancer is crucial clinically due to its insidious nature and rapid progression. Most of the prediction models designed to predict tumour recurrence in the early stage of lung cancer rely on the clinical or medical history of the patient. However, their performance could likely be improved if the input patient data contained genomic information. Unfortunately, such data is not always collected. This is the main motivation of our work, in which we have imputed and integrated a specific type of genomic data with clinical data to increase the accuracy of machine learning models for the prediction of relapse in early-stage, non-small cell lung cancer patients. Using a publicly available TCGA lung adenocarcinoma cohort of 501 patients, their aneuploidy scores were imputed into similar records in the Spanish Lung Cancer Group (SLCG) data, more specifically a cohort of 1348 early-stage patients. First, the tumor recurrence in those patients was predicted without the imputed aneuploidy scores. Then, the SLCG data were enriched with the aneuploidy scores imputed from TCGA. This integrative approach improved the prediction of the relapse risk, achieving an area under the precision-recall curve (PR-AUC) of 0.74 and an area under the ROC curve (ROC-AUC) of 0.79. Using the prediction explanation model SHAP (SHapley Additive exPlanations), we further explained the predictions performed by the machine learning model. We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk, while also improving the predictive power by incorporating proxy genomic data not available for the actual specific patients.
- Published
- 2023
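The imputation-plus-evaluation pipeline described here can be sketched with scikit-learn stand-ins: fill a mostly missing genomic column from similar records, train a classifier, and report PR-AUC and ROC-AUC. The data below is random noise, and KNNImputer is a stand-in, not the paper's actual imputation method.

```python
# Sketch of imputing a genomic feature and evaluating a relapse classifier.
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))          # clinical features + one genomic column
X[rng.random(500) < 0.7, -1] = np.nan  # aneuploidy score missing for most patients
y = rng.integers(0, 2, size=500)       # relapse labels (random here)

X_imp = KNNImputer(n_neighbors=5).fit_transform(X)   # fill scores from similar records
X_tr, X_te, y_tr, y_te = train_test_split(X_imp, y, random_state=0)
proba = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
print("PR-AUC:", average_precision_score(y_te, proba), "ROC-AUC:", roc_auc_score(y_te, proba))
```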
8. Adapting Neural Link Predictors for Data-Efficient Complex Query Answering
- Author
Arakelyan, Erik, Minervini, Pasquale, Daza, Daniel, Cochez, Michael, and Augenstein, Isabelle
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Logic in Computer Science, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Neural and Evolutionary Computing (cs.NE), Machine Learning (cs.LG), Logic in Computer Science (cs.LO)
- Abstract
Answering complex queries on incomplete knowledge graphs is a challenging task where a model needs to answer complex logical queries in the presence of missing knowledge. Prior work in the literature has proposed to address this problem by designing architectures trained end-to-end for the complex query answering task with a reasoning process that is hard to interpret while requiring data and resource-intensive training. Other lines of research have proposed re-using simple neural link predictors to answer complex queries, reducing the amount of training data by orders of magnitude while providing interpretable answers. The neural link predictor used in such approaches is not explicitly optimised for the complex query answering task, implying that its scores are not calibrated to interact together. We propose to address these problems via CQD$^{\mathcal{A}}$, a parameter-efficient score \emph{adaptation} model optimised to re-calibrate neural link prediction scores for the complex query answering task. While the neural link predictor is frozen, the adaptation component -- which only increases the number of model parameters by $0.03\%$ -- is trained on the downstream complex query answering task. Furthermore, the calibration component enables us to support reasoning over queries that include atomic negations, which was previously impossible with link predictors. In our experiments, CQD$^{\mathcal{A}}$ produces significantly more accurate results than current state-of-the-art methods, improving from $34.4$ to $35.1$ Mean Reciprocal Rank values averaged across all datasets and query types while using $\leq 30\%$ of the available training query types. We further show that CQD$^{\mathcal{A}}$ is data-efficient, achieving competitive results with only $1\%$ of the training complex queries, and robust in out-of-domain evaluations.
- Published
- 2023
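A minimal sketch of the score-adaptation idea in CQD$^{\mathcal{A}}$: the link predictor stays frozen, and only a tiny calibration head on top of its atom scores is trained. The affine-in-logit form of the adapter and the product t-norm below are illustrative assumptions.

```python
# Sketch of re-calibrating frozen link-predictor scores for complex query answering.
import torch
import torch.nn as nn

link_predictor = lambda atom_emb: torch.sigmoid(atom_emb.sum(-1))  # frozen stand-in

class ScoreAdapter(nn.Module):
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))   # the only trainable parameters
        self.beta = nn.Parameter(torch.zeros(1))

    def forward(self, score):
        # Affine transformation in logit space (an assumed adapter form).
        return torch.sigmoid(self.alpha * torch.logit(score.clamp(1e-6, 1 - 1e-6)) + self.beta)

adapter = ScoreAdapter()
atom_scores = link_predictor(torch.randn(3, 8))   # scores of the query's three atoms
query_score = torch.prod(adapter(atom_scores))    # product t-norm for the conjunction
query_score.backward()                            # gradients reach only the adapter
```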
9. Discovering Similarity and Dissimilarity Relations for Knowledge Propagation in Web Ontologies
- Author
Minervini, Pasquale, d’Amato, Claudia, Fanizzi, Nicola, and Tresp, Volker
- Published
- 2016
- Full Text
- View/download PDF
10. Machine Learning-Assisted Recurrence Prediction for Early-Stage Non-Small-Cell Lung Cancer Patients
- Author
Janik, Adrianna, Torrente, Maria, Costabello, Luca, Calvo, Virginia, Walsh, Brian, Camps, Carlos, Mohamed, Sameh K., Ortega, Ana L., Nováček, Vít, Massutí, Bartomeu, Minervini, Pasquale, Campelo, M. Rosario Garcia, del Barco, Edel, Bosch-Barrera, Joaquim, Menasalvas, Ernestina, Timilsina, Mohan, and Provencio, Mariano
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, FOS: Biological sciences, Quantitative Biology - Quantitative Methods, Quantitative Methods (q-bio.QM), Machine Learning (cs.LG)
- Abstract
Background: Stratifying cancer patients according to risk of relapse can personalize their care. In this work, we provide an answer to the following research question: How to utilize machine learning to estimate probability of relapse in early-stage non-small-cell lung cancer patients? Methods: For predicting relapse in 1,387 early-stage (I-II), non-small-cell lung cancer (NSCLC) patients from the Spanish Lung Cancer Group data (65.7 average age, 24.8% females, 75.2% males) we train tabular and graph machine learning models. We generate automatic explanations for the predictions of such models. For models trained on tabular data, we adopt SHAP local explanations to gauge how each patient feature contributes to the predicted outcome. We explain graph machine learning predictions with an example-based method that highlights influential past patients. Results: Machine learning models trained on tabular data exhibit a 76% accuracy for the Random Forest model at predicting relapse evaluated with a 10-fold cross-validation (model was trained 10 times with different independent sets of patients in test, train and validation sets, the reported metrics are averaged over these 10 test sets). Graph machine learning reaches 68% accuracy over a 200-patient, held-out test set, calibrated on a held-out set of 100 patients. Conclusions: Our results show that machine learning models trained on tabular and graph data can enable objective, personalised and reproducible prediction of relapse and therefore, disease outcome in patients with early-stage NSCLC. With further prospective and multisite validation, and additional radiological and molecular data, this prognostic model could potentially serve as a predictive decision support tool for deciding the use of adjuvant treatments in early-stage lung cancer. Keywords: Non-Small-Cell Lung Cancer, Tumor Recurrence Prediction, Machine Learning
- Published
- 2022
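The SHAP explanation step described here typically looks like the following sketch: a TreeExplainer attributes a random-forest relapse prediction to individual patient features. The data and feature semantics are synthetic stand-ins for the actual cohort.

```python
# Sketch of SHAP local explanations for a tree-based relapse model (synthetic data).
import numpy as np
import shap  # pip install shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))         # hypothetical features: age, stage, smoking, size
y = rng.integers(0, 2, size=200)      # relapse labels (random here)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])  # per-feature contribution for one patient
print(shap_values)
```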
11. An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
- Author
Wu, Yuxiang, Zhao, Yu, Hu, Baotian, Minervini, Pasquale, Stenetorp, Pontus, and Riedel, Sebastian
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
Access to external knowledge is essential for many natural language processing tasks, such as question answering and dialogue. Existing methods often rely on a parametric model that stores knowledge in its parameters, or use a retrieval-augmented model that has access to an external knowledge source. Parametric and retrieval-augmented models have complementary strengths in terms of computational efficiency and predictive accuracy. To combine the strength of both approaches, we propose the Efficient Memory-Augmented Transformer (EMAT) -- it encodes external knowledge into a key-value memory and exploits the fast maximum inner product search for memory querying. We also introduce pre-training tasks that allow EMAT to encode informative key-value representations, and to learn an implicit strategy to integrate multiple memory slots into the transformer. Experiments on various knowledge-intensive tasks such as question answering and dialogue datasets show that, simply augmenting parametric models (T5-base) using our method produces more accurate results (e.g., 25.8 -> 44.3 EM on NQ) while retaining a high throughput (e.g., 1000 queries/s on NQ). Compared to retrieval-augmented models, EMAT runs substantially faster across the board and produces more accurate results on WoW and ELI5. Our code and datasets are available at https://github.com/uclnlp/EMAT., EMNLP 2022 main conference long paper. 8 pages, 6 figures
- Published
- 2022
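The key-value memory lookup at the core of EMAT-style models can be sketched as a maximum inner product search; the brute-force search and random embeddings below are stand-ins (production systems would use approximate MIPS, and the integration of the retrieved slots into the transformer is not shown).

```python
# Sketch of key-value memory querying via maximum inner product search.
import numpy as np

rng = np.random.default_rng(0)
keys = rng.normal(size=(100_000, 64)).astype(np.float32)    # encoded questions
values = rng.normal(size=(100_000, 64)).astype(np.float32)  # encoded answers

def lookup(query, k=4):
    scores = keys @ query                  # exact MIPS by brute force; real systems
    top = np.argpartition(-scores, k)[:k]  # use approximate search (e.g. FAISS)
    return values[top]                     # k memory slots to integrate downstream

print(lookup(rng.normal(size=64).astype(np.float32)).shape)  # (4, 64)
```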
12. Learning Discrete Directed Acyclic Graphs via Backpropagation
- Author
Wren, Andrew J., Minervini, Pasquale, Franceschi, Luca, and Zantedeschi, Valentina
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
- Abstract
Recently, continuous relaxations have been proposed in order to learn Directed Acyclic Graphs (DAGs) from data by backpropagation, instead of using combinatorial optimization. However, a number of techniques for fully discrete backpropagation could instead be applied. In this paper, we explore that direction and propose DAG-DB, a framework for learning DAGs by Discrete Backpropagation. Based on the architecture of Implicit Maximum Likelihood Estimation [I-MLE, arXiv:2106.01798], DAG-DB adopts a probabilistic approach to the problem, sampling binary adjacency matrices from an implicit probability distribution. DAG-DB learns a parameter for the distribution from the loss incurred by each sample, performing competitively using either of two fully discrete backpropagation techniques, namely I-MLE and Straight-Through Estimation., 15 pages, 2 figures, 7 tables. Accepted for NeurIPS 2022 workshops on: Causal Machine Learning for Real-World Impact; and Neuro Causal and Symbolic AI
- Published
- 2022
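Of the two discrete backpropagation techniques named above, Straight-Through Estimation is the simpler to sketch: sample a binary adjacency matrix, but let gradients pass through the underlying probabilities. This toy omits DAG-DB's I-MLE variant and any acyclicity handling.

```python
# Sketch of the straight-through estimator on a binary adjacency matrix.
import torch

logits = torch.zeros(5, 5, requires_grad=True)
probs = torch.sigmoid(logits)
hard = torch.bernoulli(probs.detach())      # discrete sample, no gradient of its own
adj = hard + probs - probs.detach()         # forward value = hard, backward = probs
loss = adj.sum()                            # stand-in for a DAG-fitting loss
loss.backward()
print(logits.grad.shape)                    # gradients reach the logits: (5, 5)
```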
13. Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models
- Author
Minervini, Pasquale, Franceschi, Luca, and Niepert, Mathias
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Neural and Evolutionary Computing (cs.NE), Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
The integration of discrete algorithmic components in deep learning architectures has numerous applications. Recently, Implicit Maximum Likelihood Estimation (IMLE, Niepert, Minervini, and Franceschi 2021), a class of gradient estimators for discrete exponential family distributions, was proposed by combining implicit differentiation through perturbation with the path-wise gradient estimator. However, due to the finite difference approximation of the gradients, it is especially sensitive to the choice of the finite difference step size, which needs to be specified by the user. In this work, we present Adaptive IMLE (AIMLE), the first adaptive gradient estimator for complex discrete distributions: it adaptively identifies the target distribution for IMLE by trading off the density of gradient information with the degree of bias in the gradient estimates. We empirically evaluate our estimator on synthetic examples, as well as on Learning to Explain, Discrete Variational Auto-Encoders, and Neural Relational Inference tasks. In our experiments, we show that our adaptive gradient estimator can produce faithful estimates while requiring orders of magnitude fewer samples than other gradient estimators., Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)
- Published
- 2022
14. Machine Learning–Assisted Recurrence Prediction for Patients With Early-Stage Non–Small-Cell Lung Cancer.
- Author
Janik, Adrianna, Torrente, Maria, Costabello, Luca, Calvo, Virginia, Walsh, Brian, Camps, Carlos, Mohamed, Sameh K., Ortega, Ana L., Nováček, Vít, Massutí, Bartomeu, Minervini, Pasquale, Campelo, M. Rosario Garcia, del Barco, Edel, Bosch-Barrera, Joaquim, Menasalvas, Ernestina, Timilsina, Mohan, and Provencio, Mariano
- Subjects
NON-small-cell lung carcinoma, DISEASE relapse, MACHINE learning, CANCER patients, LUNG cancer, RESEARCH questions, PROGRESSION-free survival
- Abstract
PURPOSE: Stratifying patients with cancer according to risk of relapse can personalize their care. In this work, we provide an answer to the following research question: How to use machine learning to estimate probability of relapse in patients with early-stage non–small-cell lung cancer (NSCLC)? MATERIALS AND METHODS: For predicting relapse in 1,387 patients with early-stage (I-II) NSCLC from the Spanish Lung Cancer Group data (average age 65.7 years, female 24.8%, male 75.2%), we train tabular and graph machine learning models. We generate automatic explanations for the predictions of such models. For models trained on tabular data, we adopt SHapley Additive exPlanations local explanations to gauge how each patient feature contributes to the predicted outcome. We explain graph machine learning predictions with an example-based method that highlights influential past patients. RESULTS: Machine learning models trained on tabular data exhibit a 76% accuracy for the random forest model at predicting relapse evaluated with a 10-fold cross-validation (the model was trained 10 times with different independent sets of patients in test, train, and validation sets, and the reported metrics are averaged over these 10 test sets). Graph machine learning reaches 68% accuracy over a held-out test set of 200 patients, calibrated on a held-out set of 100 patients. CONCLUSION: Our results show that machine learning models trained on tabular and graph data can enable objective, personalized, and reproducible prediction of relapse and, therefore, disease outcome in patients with early-stage NSCLC. With further prospective and multisite validation, and additional radiological and molecular data, this prognostic model could potentially serve as a predictive decision support tool for deciding the use of adjuvant treatments in early-stage lung cancer. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Efficient energy-based embedding models for link prediction in knowledge graphs
- Author
Minervini, Pasquale, d’Amato, Claudia, and Fanizzi, Nicola
- Published
- 2016
- Full Text
- View/download PDF
16. XQA-DST: Multi-Domain and Multi-Lingual Dialogue State Tracking
- Author
Zhou, Han, Iacobacci, Ignacio, and Minervini, Pasquale
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
- Abstract
Dialogue State Tracking (DST), a crucial component of task-oriented dialogue (ToD) systems, keeps track of all important information pertaining to dialogue history: filling slots with the most probable values throughout the conversation. Existing methods generally rely on a predefined set of values and struggle to generalise to previously unseen slots in new domains. To overcome these challenges, we propose a domain-agnostic extractive question answering (QA) approach with shared weights across domains. To disentangle the complex domain information in ToDs, we train our DST with a novel domain filtering strategy by excluding out-of-domain question samples. With an independent classifier that predicts the presence of multiple domains given the context, our model tackles DST by extracting spans in active domains. Empirical results demonstrate that our model can efficiently leverage domain-agnostic QA datasets by two-stage fine-tuning while being both domain-scalable and open-vocabulary in DST. It shows strong transferability by achieving zero-shot domain-adaptation results on MultiWOZ 2.1 with an average JGA of 36.7%. It further achieves cross-lingual transfer with state-of-the-art zero-shot results, 66.2% JGA from English to German and 75.7% JGA from English to Italian on WOZ 2.0., Accepted to Findings of EACL 2023
- Published
- 2022
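The JGA figures quoted above refer to Joint Goal Accuracy; a minimal sketch of the metric, with hypothetical dialogue states, follows. A turn counts as correct only if every predicted slot-value pair matches the gold state.

```python
# Sketch of Joint Goal Accuracy (JGA) with hypothetical dialogue states.
def joint_goal_accuracy(pred_states, gold_states):
    correct = sum(p == g for p, g in zip(pred_states, gold_states))
    return correct / len(gold_states)

gold = [{"hotel-area": "north", "hotel-stars": "4"}, {"train-day": "monday"}]
pred = [{"hotel-area": "north", "hotel-stars": "4"}, {"train-day": "tuesday"}]
print(joint_goal_accuracy(pred, gold))  # 0.5
```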
17. MedDistant19: Towards an Accurate Benchmark for Broad-Coverage Biomedical Relation Extraction
- Author
Amin, Saadullah, Minervini, Pasquale, Chang, David, Stenetorp, Pontus, and Neumann, Günter
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
Relation extraction in the biomedical domain is challenging due to the lack of labeled data and the high annotation costs, which require domain experts. Distant supervision is commonly used to tackle the scarcity of annotated data by automatically pairing knowledge graph relationships with raw texts. Such a pipeline is prone to noise and faces added challenges in scaling to cover a large number of biomedical concepts. We investigated existing broad-coverage distantly supervised biomedical relation extraction benchmarks and found a significant overlap between training and test relationships ranging from 26% to 86%. Furthermore, we noticed several inconsistencies in the data construction process of these benchmarks, and where there is no train-test leakage, the focus is on interactions between narrower entity types. This work presents MedDistant19, a more accurate benchmark for broad-coverage distantly supervised biomedical relation extraction that addresses these shortcomings; it is obtained by aligning MEDLINE abstracts with the widely used SNOMED Clinical Terms knowledge base. Since prior benchmarks lack thorough evaluation with domain-specific language models, we also conduct experiments validating whether general-domain relation extraction findings carry over to biomedical relation extraction., Accepted by COLING 2022 (Oral presentation, Main Conference: Long Papers)
- Published
- 2022
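The train-test overlap analysis described here boils down to checking how many test entity pairs already co-occur in training; a minimal sketch with hypothetical triples follows (direction-insensitive matching is an assumption).

```python
# Sketch of measuring train-test relationship overlap (triples are hypothetical).
train = {("aspirin", "treats", "pain"), ("ibuprofen", "treats", "fever")}
test = {("pain", "treated_by", "aspirin"), ("statin", "treats", "cholesterol")}

train_pairs = {frozenset((h, t)) for h, _, t in train}
overlap = sum(frozenset((h, t)) in train_pairs for h, _, t in test) / len(test)
print(f"train-test relationship overlap: {overlap:.0%}")  # 50%
```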
18. Complex Query Answering with Neural Link Predictors (Extended Abstract)
- Author
Minervini, Pasquale, Arakelyan, Erik, Daza, Daniel, Cochez, Michael, and De Raedt, Luc
- Abstract
Neural link predictors are useful for identifying missing edges in large scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries containing logical conjunctions (∧), disjunctions (∨), and existential quantifiers (∃). We propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the optimisation problem, including gradient-based and combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods - black-box models trained on millions of generated queries - without the need for training on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across multiple knowledge graphs. We find that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online.
- Published
- 2022
19. Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations
- Author
Chen, Yihong, Minervini, Pasquale, Riedel, Sebastian, and Stenetorp, Pontus
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), multi-relational, Computer Science - Artificial Intelligence, self-supervised learning objective, Computation and Language (cs.CL), knowledge graph embeddings, link prediction
- Abstract
Learning good representations on multi-relational graphs is essential to knowledge base completion (KBC). In this paper, we propose a new self-supervised training objective for multi-relational graph representation learning, via simply incorporating relation prediction into the commonly used 1vsAll objective. The new training objective contains not only terms for predicting the subject and object of a given triple, but also a term for predicting the relation type. We analyse how this new objective impacts multi-relational learning in KBC: experiments on a variety of datasets and models show that relation prediction can significantly improve entity ranking, the most widely used evaluation task for KBC, yielding a 6.1% increase in MRR and 9.9% increase in Hits@1 on FB15k-237 as well as a 3.1% increase in MRR and 3.4% in Hits@1 on Aristo-v4. Moreover, we observe that the proposed objective is especially effective on highly multi-relational datasets, i.e. datasets with a large number of predicates, and generates better representations when larger embedding sizes are used., Comment: AKBC 2021
- Published
- 2021
- Full Text
- View/download PDF
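A minimal sketch of the proposed objective: the usual 1vsAll cross-entropy terms over candidate objects and subjects, plus a relation-prediction term. The DistMult-style scoring function and the weight lam are illustrative assumptions.

```python
# Sketch of 1vsAll training with an auxiliary relation-prediction term.
import torch
import torch.nn.functional as F

n_ent, n_rel, dim = 1000, 50, 64
E = torch.randn(n_ent, dim, requires_grad=True)   # entity embeddings
R = torch.randn(n_rel, dim, requires_grad=True)   # relation embeddings

def loss_1vsall_with_rel(s, r, o, lam=1.0):
    obj_scores = (E[s] * R[r]) @ E.T              # DistMult-style, vs all objects
    sub_scores = (E[o] * R[r]) @ E.T              # vs all subjects
    rel_scores = (E[s] * E[o]) @ R.T              # auxiliary: vs all relation types
    return (F.cross_entropy(obj_scores[None], torch.tensor([o]))
            + F.cross_entropy(sub_scores[None], torch.tensor([s]))
            + lam * F.cross_entropy(rel_scores[None], torch.tensor([r])))

loss_1vsall_with_rel(3, 7, 42).backward()
```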
20. Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions
- Author
Niepert, Mathias, Minervini, Pasquale, and Franceschi, Luca
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
- Abstract
Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components. I-MLE is widely applicable as it only requires the ability to compute the most probable states and does not rely on smooth relaxations. The framework encompasses several approaches such as perturbation-based implicit differentiation and recent methods to differentiate through black-box combinatorial solvers. We introduce a novel class of noise distributions for approximating marginals via perturb-and-MAP. Moreover, we show that I-MLE simplifies to maximum likelihood estimation when used in some recently studied learning settings that involve combinatorial solvers. Experiments on several datasets suggest that I-MLE is competitive with and often outperforms existing approaches which rely on problem-specific relaxations., NeurIPS 2021 camera-ready; repo: https://github.com/nec-research/tf-imle
- Published
- 2021
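The I-MLE recipe can be sketched for a top-k subset distribution: perturb the logits, take the MAP state, and estimate the gradient from the difference between the MAP states of the perturbed distribution and a "target" distribution nudged against the downstream gradient. The Gumbel noise, step size, and stand-in downstream gradient below are illustrative.

```python
# Sketch of an I-MLE-style gradient estimate for a k-subset distribution.
import numpy as np

rng = np.random.default_rng(0)

def map_topk(theta, k=3):
    z = np.zeros_like(theta)
    z[np.argsort(-theta)[:k]] = 1.0     # MAP state of a k-subset distribution
    return z

theta = rng.normal(size=10)
noise = rng.gumbel(size=10)             # perturb-and-MAP sample
z = map_topk(theta + noise)

dloss_dz = rng.normal(size=10)          # stand-in for the downstream gradient dL/dz
lam = 1.0                               # finite-difference step size
theta_target = theta + noise - lam * dloss_dz
grad_theta = (z - map_topk(theta_target)) / lam   # gradient estimate for the logits
print(grad_theta)
```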
21. PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them
- Author
Lewis, Patrick, Wu, Yuxiang, Liu, Linqing, Minervini, Pasquale, Küttler, Heinrich, Piktus, Aleksandra, Stenetorp, Pontus, and Riedel, Sebastian
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
Open-domain Question Answering models which directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compared to conventional models which retrieve and read from text corpora. QA-pair retrievers also offer interpretable answers, a high degree of control, and are trivial to update at test time with new knowledge. However, these models lack the accuracy of retrieve-and-read systems, as substantially less knowledge is covered by the available QA-pairs relative to text corpora like Wikipedia. To facilitate improved QA-pair models, we introduce Probably Asked Questions (PAQ), a very large resource of 65M automatically-generated QA-pairs. We introduce a new QA-pair retriever, RePAQ, to complement PAQ. We find that PAQ preempts and caches test questions, enabling RePAQ to match the accuracy of recent retrieve-and-read models, whilst being significantly faster. Using PAQ, we train CBQA models which outperform comparable baselines by 5%, but trail RePAQ by over 15%, indicating the effectiveness of explicit retrieval. RePAQ can be configured for size (under 500MB) or speed (over 1K questions per second) whilst retaining high accuracy. Lastly, we demonstrate RePAQ's strength at selective QA, abstaining from answering when it is likely to be incorrect. This enables RePAQ to "back off" to a more expensive state-of-the-art model, leading to a combined system which is both more accurate and 2x faster than the state-of-the-art model alone.
- Published
- 2021
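The retrieval-and-back-off behaviour described above can be sketched as a nearest-neighbour lookup over cached question embeddings with a score threshold for abstention; the embeddings, cache contents, and threshold below are random stand-ins.

```python
# Sketch of QA-pair retrieval with selective abstention (back-off).
import numpy as np

rng = np.random.default_rng(0)
cached_q = rng.normal(size=(1000, 32))
cached_q /= np.linalg.norm(cached_q, axis=1, keepdims=True)   # cached question embeddings
cached_a = [f"answer_{i}" for i in range(1000)]               # their cached answers

def answer(q_emb, threshold=0.9):
    q_emb = q_emb / np.linalg.norm(q_emb)
    scores = cached_q @ q_emb
    best = int(np.argmax(scores))
    if scores[best] < threshold:
        return None            # abstain: defer to a slower retrieve-and-read model
    return cached_a[best]

print(answer(rng.normal(size=32)))  # likely None with random embeddings
```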
22. Grid-to-Graph: Flexible Spatial Relational Inductive Biases for Reinforcement Learning
- Author
Jiang, Zhengyao, Minervini, Pasquale, Jiang, Minqi, and Rocktäschel, Tim
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
Although reinforcement learning has been successfully applied in many domains in recent years, we still lack agents that can systematically generalize. While relational inductive biases that fit a task can improve generalization of RL agents, these biases are commonly hard-coded directly in the agent's neural architecture. In this work, we show that we can incorporate relational inductive biases, encoded in the form of relational graphs, into agents. Based on this insight, we propose Grid-to-Graph (GTG), a mapping from grid structures to relational graphs that carry useful spatial relational inductive biases when processed through a Relational Graph Convolution Network (R-GCN). We show that, with GTG, R-GCNs generalize better both in-distribution and out-of-distribution than baselines based on Convolutional Neural Networks and Neural Logic Machines on challenging procedurally generated environments and MinAtar. Furthermore, we show that GTG produces agents that can jointly reason over observations and environment dynamics encoded in knowledge bases., Accepted by AAMAS 2021
- Published
- 2021
23. Complex Query Answering with Neural Link Predictors
- Author
Arakelyan, Erik, Daza, Daniel, Minervini, Pasquale, and Cochez, Michael
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Logic in Computer Science, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Neural and Evolutionary Computing (cs.NE), Machine Learning (cs.LG), Logic in Computer Science (cs.LO)
- Abstract
Neural link predictors are immensely useful for identifying missing edges in large scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries that arise in a number of domains, such as queries using logical conjunctions ($\land$), disjunctions ($\lor$) and existential quantifiers ($\exists$), while accounting for missing edges. In this work, we propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the optimisation problem, including gradient-based and combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods -- black-box neural models trained on millions of generated queries -- without the need of training on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across different knowledge graphs containing factual information. Finally, we demonstrate that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online, at https://github.com/uclnlp/cqd., Proceedings of the Ninth International Conference on Learning Representations (ICLR 2021, oral presentation)
- Published
- 2020
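A minimal sketch of the continuous relaxation described here: per-atom link-predictor scores are combined with a product t-norm for the conjunction, and the existential variable is handled by maximising over candidate entities (here by brute-force enumeration rather than gradient-based search). The scores are random stand-ins for a trained link predictor.

```python
# Sketch of scoring the conjunctive query  ?T : friend(anna, V) AND works_at(V, T).
import numpy as np

rng = np.random.default_rng(0)
n_entities = 50
score_friend = rng.random(n_entities)               # link predictor: friend(anna, V)
score_works = rng.random((n_entities, n_entities))  # link predictor: works_at(V, T)

# Product t-norm for the conjunction; the existential quantifier becomes a max over V.
query_scores = (score_friend[:, None] * score_works).max(axis=0)  # one score per T
print("top answer:", int(np.argmax(query_scores)))
```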
24. Knowledge Graph Embeddings and Explainable AI
- Author
Bianchi, Federico, Rossiello, Gaetano, Costabello, Luca, Palmonari, Matteo, and Minervini, Pasquale
- Subjects
Knowledge Graph, Knowledge Graph Embedding, Knowledge Representation, eXplainable AI
- Abstract
Knowledge graph embeddings are now a widely adopted approach to knowledge representation in which entities and relationships are embedded in vector spaces. In this chapter, we introduce the reader to the concept of knowledge graph embeddings by explaining what they are, how they can be generated and how they can be evaluated. We summarize the state-of-the-art in this field by describing the approaches that have been introduced to represent knowledge in the vector space. In relation to knowledge representation, we consider the problem of explainability, and discuss models and methods for explaining predictions obtained via knowledge graph embeddings.
- Published
- 2020
25. Neural Variational Inference For Estimating Uncertainty in Knowledge Graph Embeddings
- Author
Cowen-Rivers, Alexander I., Minervini, Pasquale, Rocktäschel, Tim, Bošnjak, Matko, Riedel, Sebastian, and Wang, Jun
- Subjects
FOS: Computer and information sciences, Computer Science - Symbolic Computation, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML), Symbolic Computation (cs.SC), Machine Learning (cs.LG)
- Abstract
Recent advances in Neural Variational Inference allowed for a renaissance in latent variable models in a variety of domains involving high-dimensional data. While traditional variational methods derive an analytical approximation for the intractable distribution over the latent variables, here we construct an inference network conditioned on the symbolic representation of entities and relation types in the Knowledge Graph, to provide the variational distributions. The new framework results in a highly-scalable method. Under a Bernoulli sampling framework, we provide an alternative justification for commonly used techniques in large-scale stochastic variational inference, which drastically reduce training time at a cost of an additional approximation to the variational lower bound. We introduce two models from this highly scalable probabilistic framework, namely the Latent Information and Latent Fact models, for reasoning over knowledge graph-based representations. Our Latent Information and Latent Fact models improve upon baseline performance under certain conditions. We use the learnt embedding variance to estimate predictive uncertainty during link prediction, and discuss the quality of these learnt uncertainty estimates. Our source code and datasets are publicly available online at https://github.com/alexanderimanicowenrivers/Neural-Variational-Knowledge-Graphs., Accepted at IJCAI 19 Neural-Symbolic Learning and Reasoning Workshop
- Published
- 2019
26. Adversarially Regularising Neural NLI Models to Integrate Logical Background Knowledge
- Author
Minervini, Pasquale and Riedel, Sebastian
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML), Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
Adversarial examples are inputs to machine learning models designed to cause the model to make a mistake. They are useful for understanding the shortcomings of machine learning models, interpreting their results, and for regularisation. In NLP, however, most example generation strategies produce input text by using known, pre-specified semantic transformations, requiring significant manual effort and in-depth understanding of the problem and domain. In this paper, we investigate the problem of automatically generating adversarial examples that violate a set of given First-Order Logic constraints in Natural Language Inference (NLI). We reduce the problem of identifying such adversarial examples to a combinatorial optimisation problem, by maximising a quantity measuring the degree of violation of such constraints and by using a language model for generating linguistically-plausible examples. Furthermore, we propose a method for adversarially regularising neural NLI models for incorporating background knowledge. Our results show that, while the proposed method does not always improve results on the SNLI and MultiNLI datasets, it significantly and consistently increases the predictive accuracy on adversarially-crafted datasets -- up to a 79.6% relative improvement -- while drastically reducing the number of background knowledge violations. Furthermore, we show that adversarial examples transfer among model architectures, and that the proposed adversarial training procedure improves the robustness of NLI models to adversarial examples., Accepted at the SIGNLL Conference on Computational Natural Language Learning (CoNLL 2018)
- Published
- 2018
27. Extrapolation in NLP
- Author
Mitchell, Jeff, Minervini, Pasquale, Stenetorp, Pontus, and Riedel, Sebastian
- Subjects
Computer Science - Computation and Language
- Abstract
We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.
- Published
- 2018
28. Apertium goes SOA: an efficient and scalable service based on the Apertium rule-based machine translation platform
- Author
Minervini, Pasquale
- Subjects
Computer Languages and Systems (Lenguajes y Sistemas Informáticos), Service Oriented Architecture, Machine translation, Apertium
- Abstract
Service Oriented Architecture (SOA) is a paradigm for organising and using distributed services that may be under the control of different ownership domains and implemented using various technology stacks. In some contexts, an organisation using an IT infrastructure implementing the SOA paradigm can benefit greatly from integrating efficient machine translation (MT) services into its business processes to overcome language barriers. This paper describes the architecture and the design patterns used to develop an MT service that is efficient, scalable and easy to integrate in new and existing business processes. The service is based on Apertium, a free/open-source rule-based machine translation platform. Development for this project was funded as part of the Google Summer of Code.
- Published
- 2009
29. Adaptive Knowledge Propagation in Web Ontologies.
- Author
Minervini, Pasquale, Tresp, Volker, d'Amato, Claudia, and Fanizzi, Nicola
- Subjects
ASSISTIVE computer technology, KNOWLEDGE management, WEB services, HOMOPHILY theory (Communication), SPARSE approximations
- Abstract
We focus on the problem of predicting missing assertions in Web ontologies. We start from the assumption that individual resources that are similar in some aspects are more likely to be linked by specific relations: this phenomenon is also referred to as homophily and emerges in a variety of relational domains. In this article, we propose a method for (1) identifying which relations in the ontology are more likely to link similar individuals and (2) efficiently propagating knowledge across chains of similar individuals. By enforcing sparsity in the model parameters, the proposed method is able to select only the most relevant relations for a given prediction task. Our experimental evaluation demonstrates the effectiveness of the proposed method in comparison to state-of-the-art methods from the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
30. Scalable Learning of Entity and Predicate Embeddings for Knowledge Graph Completion.
- Author
Minervini, Pasquale, Fanizzi, Nicola, d'Amato, Claudia, and Esposito, Floriana
- Published
- 2015
- Full Text
- View/download PDF
31. A Gaussian Process Model for Knowledge Propagation in Web Ontologies.
- Author
Minervini, Pasquale, d'Amato, Claudia, Fanizzi, Nicola, and Esposito, Floriana
- Published
- 2014
- Full Text
- View/download PDF
32. Learning Probabilistic Description Logic Concepts Under Alternative Assumptions on Incompleteness.
- Author
Minervini, Pasquale, d'Amato, Claudia, Fanizzi, Nicola, and Esposito, Floriana
- Published
- 2014
- Full Text
- View/download PDF
33. Graph-Based Regularization for Transductive Class-Membership Prediction.
- Author
Minervini, Pasquale, d'Amato, Claudia, Fanizzi, Nicola, and Esposito, Floriana
- Published
- 2014
- Full Text
- View/download PDF
34. Adaptive Knowledge Propagation in Web Ontologies.
- Author
Minervini, Pasquale, d'Amato, Claudia, Fanizzi, Nicola, and Esposito, Floriana
- Published
- 2014
- Full Text
- View/download PDF
35. Learning probabilistic Description logic concepts.
- Author
Minervini, Pasquale, d'Amato, Claudia, and Fanizzi, Nicola
- Published
- 2012
- Full Text
- View/download PDF
36. Numeric Prediction on OWL Knowledge Bases Through Terminological Regression Trees.
- Author
Fanizzi, Nicola, d'Amato, Claudia, Esposito, Floriana, and Minervini, Pasquale
- Subjects
SEMANTIC computing, DATA modeling, KNOWLEDGE base, REGRESSION trees, PREDICTION models, RELATIONAL databases
- Abstract
In the context of semantic knowledge bases, among the possible problems that may be tackled by means of data-driven inductive strategies, one can consider those that require the prediction of the unknown values of existing numeric features or the definition of new features to be derived from the data model. These problems can be cast as regression problems so that suitable solutions can be devised based on those found for multi-relational databases. In this paper, a new framework for the induction of logical regression trees is presented. Differently from the classic logical regression trees and the recent fork of the terminological classification trees, the novel terminological regression trees aim at predicting continuous values, while tests at the tree nodes are expressed with Description Logic concepts. They are intended for multiple uses with knowledge bases expressed in the standard ontology languages for the Semantic Web. A top-down method for growing such trees is proposed as well as algorithms for making predictions with the trees and deriving rules. The system that implements these methods is experimentally evaluated on ontologies selected from popular repositories. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
37. Knowledge graph embeddings in the biomedical domain: are they useful? A look at link prediction, rule learning, and downstream polypharmacy tasks.
- Author
Gema AP, Grabarczyk D, De Wulf W, Borole P, Alfaro JA, Minervini P, Vergari A, and Rajan A
- Abstract
Summary: Knowledge graphs (KGs) are powerful tools for representing and organizing complex biomedical data. They empower researchers, physicians, and scientists by facilitating rapid access to biomedical information, enabling the discernment of patterns or insights, and fostering the formulation of decisions and the generation of novel knowledge. To automate these activities, several KG embedding algorithms have been proposed to learn from and complete KGs. However, the efficacy of these embedding algorithms appears limited when applied to biomedical KGs, prompting questions about whether they can be useful in this field. To that end, we explore several widely used KG embedding models and evaluate their performance and applications using a recent biomedical KG, BioKG. We also demonstrate that by using recent best practices for training KG embeddings, it is possible to improve performance over BioKG. Additionally, we address interpretability concerns that naturally arise with such machine learning methods. In particular, we examine rule-based methods that aim to address these concerns by making interpretable predictions using learned rules, achieving comparable performance. Finally, we discuss a realistic use case where a pretrained BioKG embedding is further trained for a specific task: four polypharmacy scenarios in which the goal is to predict missing links or entities in other downstream KGs. We conclude that in the right scenarios, biomedical KG embeddings can be effective and useful., Availability and Implementation: Our code and data are available at https://github.com/aryopg/biokge., Competing Interests: None declared., (© The Author(s) 2024. Published by Oxford University Press.)
- Published
- 2024
- Full Text
- View/download PDF
38. Integration of Clinical Information and Imputed Aneuploidy Scores to Enhance Relapse Prediction in Early Stage Lung Cancer Patients.
- Author
Timilsina M, Buosi S, Fey D, Janik A, Torrente M, Provencio M, Bermúdez AC, Carcereny E, Costabello L, Abreu DR, Cobo M, Castro RL, Bernabé R, Guirado M, Minervini P, and Nováček V
- Subjects
- Humans, Neoplasm Recurrence, Local, Genomics, Lung Neoplasms, Carcinoma, Non-Small-Cell Lung genetics
- Abstract
Early-stage lung cancer is crucial clinically due to its insidious nature and rapid progression. Most of the prediction models designed to predict tumour recurrence in the early stage of lung cancer rely on the clinical or medical history of the patient. However, their performance could likely be improved if the input patient data contained genomic information. Unfortunately, such data is not always collected. This is the main motivation of our work, in which we have imputed and integrated a specific type of genomic data with clinical data to increase the accuracy of machine learning models for the prediction of relapse in early-stage, non-small cell lung cancer patients. Using a publicly available TCGA lung adenocarcinoma cohort of 501 patients, their aneuploidy scores were imputed into similar records in the Spanish Lung Cancer Group (SLCG) data, more specifically a cohort of 1348 early-stage patients. First, the tumor recurrence in those patients was predicted without the imputed aneuploidy scores. Then, the SLCG data were enriched with the aneuploidy scores imputed from TCGA. This integrative approach improved the prediction of the relapse risk, achieving an area under the precision-recall curve (PR-AUC) of 0.74 and an area under the ROC curve (ROC-AUC) of 0.79. Using the prediction explanation model SHAP (SHapley Additive exPlanations), we further explained the predictions performed by the machine learning model. We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk, while also improving the predictive power by incorporating proxy genomic data not available for the actual specific patients., (©2022 AMIA - All rights reserved.)
- Published
- 2023
39. On Predicting Recurrence in Early Stage Non-small Cell Lung Cancer.
- Author
Mohamed SK, Walsh B, Timilsina M, Torrente M, Franco F, Provencio M, Janik A, Costabello L, Minervini P, Stenetorp P, and Nováček V
- Subjects
- Humans, Neoplasm Staging, Nomograms, Prognosis, Retrospective Studies, Carcinoma, Non-Small-Cell Lung diagnosis, Carcinoma, Non-Small-Cell Lung pathology, Lung Neoplasms diagnosis
- Abstract
Early detection and mitigation of disease recurrence in non-small cell lung cancer (NSCLC) patients is a nontrivial problem that is typically addressed either by rather generic follow-up screening guidelines, self-reporting, simple nomograms, or by models that predict relapse risk in individual patients using statistical analysis of retrospective data. We posit that machine learning models trained on patient data can provide an alternative approach that allows for more efficient development of many complementary models at once, superior accuracy, less dependency on the data collection protocols and increased support for explainability of the predictions. In this preliminary study, we describe an experimental suite of various machine learning models applied on a patient cohort of 2442 early stage NSCLC patients. We discuss the promising results achieved, as well as the lessons we learned while developing this baseline for further, more advanced studies in this area., (©2021 AMIA - All rights reserved.)
- Published
- 2022