Search

Your search for author "Durmus, Esin" returned 70 results

Search Results

1. Collective Constitutional AI: Aligning a Language Model with Public Input

2. NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps

3. Evaluating and Mitigating Discrimination in Language Model Decisions

4. Specific versus General Principles for Constitutional AI

5. Towards Understanding Sycophancy in Language Models

6. Studying Large Language Model Generalization with Influence Functions

7. Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

8. Measuring Faithfulness in Chain-of-Thought Reasoning

9. Towards Measuring the Representation of Subjective Global Opinions in Language Models

10. Opportunities and Risks of LLMs for Scalable Deliberation with Polis

11. Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

12. Whose Opinions Do Language Models Reflect?

13. Benchmarking Large Language Models for News Summarization

14. Contrastive Error Attribution for Finetuned Language Models

15. Evaluating Human-Language Model Interaction

16. Holistic Evaluation of Language Models

17. Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

18. GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

19. Spurious Correlations in Reference-Free Evaluation of Text Generation

20. Language modeling via stochastic processes

21. Towards Understanding Persuasion in Computational Argumentation

22. Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization

23. On the Opportunities and Risks of Foundation Models

24. The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

25. Exploring the Role of Argument Structure in Online Debate Persuasion

26. WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization

27. FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization

28. The Role of Pragmatic and Discourse Context in Determining Argument Impact

29. Determining Relative Argument Specificity and Stance for Complex Argumentative Structures

30. Exploring the Role of Prior Beliefs for Argument Persuasion

31. A Corpus for Modeling User and Language Effects in Argumentation on Online Debating

39. GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

43. The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

45. Topic and Emotion Development among Dutch COVID-19 Twitter Communities in the early Pandemic

46. Matching Theory and Data with Personal-ITY: What a Corpus of Italian YouTube Comments Reveals About Personality
