Author: "Josifoski, Martin" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Josifoski, Martin"' showing total 4 results

1. Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science

Author: Veselovsky, Veniamin, Ribeiro, Manoel Horta, Arora, Akhil, Josifoski, Martin, Anderson, Ashton, and West, Robert
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
Abstract: Large Language Models (LLMs) have democratized synthetic data generation, which in turn has the potential to simplify and broaden a wide gamut of NLP tasks. Here, we tackle a pervasive problem in synthetic data generation: its generative distribution often differs from the distribution of real-world data researchers care about (in other words, it is unfaithful). In a case study on sarcasm detection, we study three strategies to increase the faithfulness of synthetic data: grounding, filtering, and taxonomy-based generation. We evaluate these strategies using the performance of classifiers trained with generated synthetic data on real-world data. While all three strategies improve the performance of classifiers, we find that grounding works best for the task at hand. As synthetic data generation plays an ever-increasing role in NLP research, we expect this work to be a stepping stone in improving its utility. We conclude this paper with some recommendations on how to generate high(er)-fidelity synthetic data for specific tasks.
Comment: 8 pages
Published: 2023

2. PAC-Bayesian Meta-Learning: From Theory to Practice

Author: Rothfuss, Jonas, Josifoski, Martin, Fortuin, Vincent, and Krause, Andreas
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Meta-Learning aims to accelerate the learning on new tasks by acquiring useful inductive biases from related data sources. In practice, the number of tasks available for meta-learning is often small. Yet, most of the existing approaches rely on an abundance of meta-training tasks, making them prone to overfitting. How to regularize the meta-learner to ensure generalization to unseen tasks, is a central question in the literature. We provide a theoretical analysis using the PAC-Bayesian framework and derive the first bound for meta-learners with unbounded loss functions. Crucially, our bounds allow us to derive the PAC-optimal hyper-posterior (PACOH) - the closed-form-solution of the PAC-Bayesian meta-learning problem, thereby avoiding the reliance on nested optimization, giving rise to an optimization problem amenable to standard variational methods that scale well. Our experiments show that, when instantiating the PACOH with Gaussian processes and Bayesian Neural Networks as base learners, the resulting methods are more scalable, and yield state-of-the-art performance, both in terms of predictive accuracy and the quality of uncertainty estimates. Finally, thanks to the principled treatment of uncertainty, our meta-learners can also be successfully employed for sequential decision problems.
Comment: 50 pages
Published: 2022

3. Invariant Language Modeling

Author: Peyrard, Maxime, Ghotra, Sarvjeet Singh, Josifoski, Martin, Agarwal, Vidhan, Patra, Barun, Carignan, Dean, Kiciman, Emre, and West, Robert
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine Learning (cs.LG)
Abstract: Large pretrained language models are critical components of modern NLP pipelines. Yet, they suffer from spurious correlations, poor out-of-domain generalization, and biases. Inspired by recent progress in causal machine learning, in particular the invariant risk minimization (IRM) paradigm, we propose invariant language modeling, a framework for learning invariant representations that generalize better across multiple environments. In particular, we adapt a game-theoretic formulation of IRM (IRM-games) to language models, where the invariance emerges from a specific training schedule in which all the environments compete to optimize their own environment-specific loss by updating subsets of the model in a round-robin fashion. We focus on controlled experiments to precisely demonstrate the ability of our method to (i) remove structured noise, (ii) ignore specific spurious correlations without affecting global performance, and (iii) achieve better out-of-domain generalization. These benefits come with a negligible computational overhead compared to standard training, do not require changing the local loss, and can be applied to any language model. We believe this framework is promising to help mitigate spurious correlations and biases in language models.
Comment: Published at EMNLP 2022
Published: 2021

4. Language Model Decoding as Likelihood–Utility Alignment

Author: Josifoski, Martin, Peyrard, Maxime, Rajic, Frano, Wei, Jiheng, Paul, Debjit, Hartmann, Valentin, Patra, Barun, Chaudhary, Vishrav, Kiciman, Emre, Faltings, Boi, and West, Robert
Abstract: A critical component of a successful language generation pipeline is the decoding algorithm. However, the general principles that should guide the choice of a decoding algorithm remain unclear. Previous works only compare decoding algorithms in narrow scenarios, and their findings do not generalize across tasks. We argue that the misalignment between the model’s likelihood and the task-specific notion of utility is the key factor to understanding the effectiveness of decoding algorithms. To structure the discussion, we introduce a taxonomy of misalignment mitigation strategies (MMSs), providing a unifying view of decoding as a tool for alignment. The MMS taxonomy groups decoding algorithms based on their implicit assumptions about likelihood–utility misalignment, yielding general statements about their applicability across tasks. Specifically, by analyzing the correlation between the likelihood and the utility of predictions across a diverse set of tasks, we provide empirical evidence supporting the proposed taxonomy and a set of principles to structure reasoning when choosing a decoding algorithm. Crucially, our analysis is the first to relate likelihood-based decoding algorithms with algorithms that rely on external information, such as value-guided methods and prompting, and covers the most diverse set of tasks to date. Code, data, and models are available at https://github.com/epfl-dlab/understanding-decoding.
