980 results for "Stein, Benno"
Search Results
2. Are Large Language Models Reliable Argument Quality Annotators?
- Author
-
Mirzakhmedova, Nailia, Gohsen, Marcel, Chang, Chia Hao, and Stein, Benno
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Emerging Technologies - Abstract
Evaluating the quality of arguments is a crucial aspect of any system leveraging argument mining. However, it is a challenge to obtain reliable and consistent annotations regarding argument quality, as this usually requires domain-specific expertise of the annotators. Even among experts, the assessment of argument quality is often inconsistent due to the inherent subjectivity of this task. In this paper, we study the potential of using state-of-the-art large language models (LLMs) as proxies for argument quality annotators. To assess the capability of LLMs in this regard, we analyze the agreement between model, human expert, and human novice annotators based on an established taxonomy of argument quality dimensions. Our findings highlight that LLMs can produce consistent annotations, with a moderately high agreement with human experts across most of the quality dimensions. Moreover, we show that using LLMs as additional annotators can significantly improve the agreement between annotators. These results suggest that LLMs can serve as a valuable tool for automated argument quality assessment, thus streamlining and accelerating the evaluation of large argument datasets., Comment: 18 pages, 5 figures, 5 tables
- Published
- 2024
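The agreement analysis described in the abstract above can be approximated with standard chance-corrected agreement statistics. A minimal sketch, assuming argument quality ratings have already been collected as integer labels per annotator; the label values below are hypothetical and not taken from the paper:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical quality ratings (1-3) for the same ten arguments,
# once from a human expert and once from an LLM used as annotator.
expert_labels = [3, 2, 2, 1, 3, 3, 2, 1, 2, 3]
llm_labels    = [3, 2, 1, 1, 3, 2, 2, 1, 2, 3]

# Cohen's kappa corrects raw agreement for agreement expected by chance.
kappa = cohen_kappa_score(expert_labels, llm_labels, weights="quadratic")
print(f"Quadratically weighted Cohen's kappa: {kappa:.2f}")
```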
3. If there's a Trigger Warning, then where's the Trigger? Investigating Trigger Warnings at the Passage Level
- Author
-
Wiegmann, Matti, Rakete, Jennifer, Wolska, Magdalena, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Computation and Language, Computer Science - Computers and Society - Abstract
Trigger warnings are labels that preface documents with sensitive content if this content could be perceived as harmful by certain groups of readers. Since warnings about a document intuitively need to be shown before reading it, authors usually assign trigger warnings at the document level. What parts of their writing prompted them to assign a warning, however, remains unclear. We investigate for the first time the feasibility of identifying the triggering passages of a document, both manually and computationally. We create a dataset of 4,135 English passages, each annotated with one of eight common trigger warnings. In a large-scale evaluation, we then systematically evaluate the effectiveness of fine-tuned and few-shot classifiers, and their generalizability. We find that trigger annotation belongs to the group of subjective annotation tasks in NLP, and that automatic trigger classification remains challenging but feasible.
- Published
- 2024
4. Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders
- Author
-
Schlatt, Ferdinand, Fröbe, Maik, Scells, Harrisen, Zhuang, Shengyao, Koopman, Bevan, Zuccon, Guido, Stein, Benno, Potthast, Martin, and Hagen, Matthias
- Subjects
Computer Science - Information Retrieval - Abstract
Existing cross-encoder re-rankers can be categorized as pointwise, pairwise, or listwise models. Pair- and listwise models allow passage interactions, which usually makes them more effective than pointwise models but also less efficient and less robust to input order permutations. To enable efficient permutation-invariant passage interactions during re-ranking, we propose a new cross-encoder architecture with inter-passage attention: the Set-Encoder. In Cranfield-style experiments on TREC Deep Learning and TIREx, the Set-Encoder is as effective as state-of-the-art listwise models while improving efficiency and robustness to input permutations. Interestingly, a pointwise model is similarly effective, but when additionally requiring the models to consider novelty, the Set-Encoder is more effective than its pointwise counterpart and retains its advantageous properties compared to other listwise models. Our code and models are publicly available at https://github.com/webis-de/set-encoder.
- Published
- 2024
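One of the claimed properties of the Set-Encoder above, robustness to input order, can be sanity-checked for any listwise re-ranker with a simple permutation test. A minimal sketch under the assumption that the re-ranker is exposed as a function mapping a query and a passage list to one score per passage; `rerank` below is a hypothetical stand-in, not the released Set-Encoder API:

```python
import random
from typing import Callable, List

def is_permutation_invariant(
    rerank: Callable[[str, List[str]], List[float]],
    query: str,
    passages: List[str],
    trials: int = 5,
    tol: float = 1e-6,
) -> bool:
    """Check whether per-passage scores stay the same when the input order is shuffled."""
    reference = dict(zip(passages, rerank(query, passages)))
    for _ in range(trials):
        shuffled = passages[:]
        random.shuffle(shuffled)
        scores = dict(zip(shuffled, rerank(query, shuffled)))
        if any(abs(scores[p] - reference[p]) > tol for p in passages):
            return False
    return True

# Toy stand-in scorer: a pointwise model (scores independent of order) passes the check.
toy_rerank = lambda q, ps: [float(len(set(q.split()) & set(p.split()))) for p in ps]
print(is_permutation_invariant(toy_rerank, "set encoder ranking",
                               ["a set", "an encoder", "ranking passages"]))
```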
5. Task-Oriented Paraphrase Analytics
- Author
-
Gohsen, Marcel, Hagen, Matthias, Potthast, Martin, and Stein, Benno
- Subjects
Computer Science - Computation and Language - Abstract
Since paraphrasing is an ill-defined task, the term "paraphrasing" covers text transformation tasks with different characteristics. Consequently, existing paraphrasing studies have applied quite different (explicit and implicit) criteria as to when a pair of texts is to be considered a paraphrase, all of which amount to postulating a certain level of semantic or lexical similarity. In this paper, we conduct a literature review and propose a taxonomy to organize the 25 identified paraphrasing (sub-)tasks. Using classifiers trained to identify the tasks that a given paraphrasing instance fits, we find that the distributions of task-specific instances in the known paraphrase corpora vary substantially. This means that the use of these corpora, without the respective paraphrase conditions being clearly defined (which is the normal case), must lead to incomparable and misleading results., Comment: Accepted at LREC-COLING 2024
- Published
- 2024
6. Detecting Generated Native Ads in Conversational Search
- Author
-
Schmidt, Sebastian, Zelch, Ines, Bevendorff, Janek, Stein, Benno, Hagen, Matthias, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval, Computer Science - Computation and Language - Abstract
Conversational search engines such as YouChat and Microsoft Copilot use large language models (LLMs) to generate responses to queries. It is only a small step to also let the same technology insert ads within the generated responses - instead of separately placing ads next to a response. Inserted ads would be reminiscent of native advertising and product placement, both of which are very effective forms of subtle and manipulative advertising. Considering the high computational costs associated with LLMs, for which providers need to develop sustainable business models, users of conversational search engines may very well be confronted with generated native ads in the near future. In this paper, we thus take a first step to investigate whether LLMs can also be used as a countermeasure, i.e., to block generated native ads. We compile the Webis Generated Native Ads 2024 dataset of queries and generated responses with automatically inserted ads, and evaluate whether LLMs or fine-tuned sentence transformers can detect the ads. In our experiments, the investigated LLMs struggle with the task but sentence transformers achieve precision and recall values above 0.9., Comment: WWW'24 Short Papers Track; 4 pages
- Published
- 2024
- Full Text
- View/download PDF
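A rough illustration of the sentence-transformer baseline mentioned above: embed individual response sentences and train a small classifier to separate ad-like from organic sentences. This is a minimal sketch; the model name and the toy training sentences are placeholders, not the Webis Generated Native Ads 2024 data or the fine-tuned models from the paper:

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

# Tiny hypothetical training set: 1 = inserted ad sentence, 0 = organic answer sentence.
sentences = [
    "For the best rates, book your trip today with TravelNow Plus.",
    "Many travelers prefer shoulder season because flights are cheaper.",
    "Upgrade to SearchPro Premium for an ad-free experience.",
    "Conversational search engines generate responses with large language models.",
]
labels = [1, 0, 1, 0]

clf = LogisticRegression().fit(encoder.encode(sentences), labels)

response_sentence = "Try the new QuickStay app to save on hotels near the conference venue."
prob_ad = clf.predict_proba(encoder.encode([response_sentence]))[0][1]
print(f"Estimated probability of being an inserted ad: {prob_ad:.2f}")
```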
7. Argumentation in Waltz's 'Emerging Structure of International Politics'
- Author
-
Wolska, Magdalena, Fröhlich, Bernd, Girgensohn, Katrin, Gholiagha, Sassan, Kiesel, Dora, Neyer, Jürgen, Riehmann, Patrick, Sienknecht, Mitja, and Stein, Benno
- Subjects
Computer Science - Computation and Language - Abstract
We present an annotation scheme for argumentative and domain-specific aspects of scholarly articles on the theory of International Relations. At the argumentation level, we identify Claims and Support/Attack relations. At the domain level, we model discourse content in terms of Theory and Data-related statements. We annotate Waltz's 1993 text on structural realism and show that our scheme can be reliably applied by domain experts and enables insights into two research questions on justifications of claims., Comment: 9 pages
- Published
- 2023
8. Evaluating Generative Ad Hoc Information Retrieval
- Author
-
Gienapp, Lukas, Scells, Harrisen, Deckers, Niklas, Bevendorff, Janek, Wang, Shuai, Kiesel, Johannes, Syed, Shahbaz, Fröbe, Maik, Zuccon, Guido, Stein, Benno, Hagen, Matthias, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval, Computer Science - Computation and Language - Abstract
Recent advances in large language models have enabled the development of viable generative retrieval systems. Instead of a traditional document ranking, generative retrieval systems often directly return a grounded generated text as a response to a query. Quantifying the utility of the textual responses is essential for appropriately evaluating such generative ad hoc retrieval. Yet, the established evaluation methodology for ranking-based ad hoc retrieval is not suited for the reliable and reproducible evaluation of generated responses. To lay a foundation for developing new evaluation methods for generative retrieval systems, we survey the relevant literature from the fields of information retrieval and natural language processing, identify search tasks and system architectures in generative retrieval, develop a new user model, and study its operationalization., Comment: 14 pages, 6 figures, 1 table. Published at SIGIR'24 perspective paper track
- Published
- 2023
- Full Text
- View/download PDF
9. Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels
- Author
-
Tschirschwitz, David, Benz, Christian, Florek, Morris, Norderhus, Henrik, Stein, Benno, and Rodehorst, Volker
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The reliability of supervised machine learning systems depends on the accuracy and availability of ground truth labels. However, the process of human annotation, being prone to error, introduces the potential for noisy labels, which can impede the practicality of these systems. While training with noisy labels is a significant consideration, the reliability of test data is also crucial to ascertain the dependability of the results. A common approach to addressing this issue is repeated labeling, where multiple annotators label the same example, and their labels are combined to provide a better estimate of the true label. In this paper, we propose a novel localization algorithm that adapts well-established ground truth estimation methods for object detection and instance segmentation tasks. The key innovation of our method lies in its ability to transform combined localization and classification tasks into classification-only problems, thus enabling the application of techniques such as Expectation-Maximization (EM) or Majority Voting (MJV). Although our main focus is the aggregation of unique ground truth for test data, our algorithm also shows superior performance during training on the TexBiG dataset, surpassing both noisy label training and label aggregation using Weighted Boxes Fusion (WBF). Our experiments indicate that the benefits of repeated labels emerge under specific dataset and annotation configurations. The key factors appear to be (1) dataset complexity, (2) annotator consistency, and (3) the given annotation budget constraints.
- Published
- 2023
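The repeated-labeling idea above reduces, in its simplest form, to grouping overlapping boxes from different annotators and taking a majority vote over their class labels. Below is a minimal sketch of that reduction (IoU-based grouping plus majority voting), not the paper's full EM-capable localization algorithm:

```python
from collections import Counter
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def majority_vote(annotations: List[Tuple[Box, str]], iou_threshold: float = 0.5):
    """Cluster boxes from repeated annotators by IoU and vote on each cluster's class."""
    clusters: List[Dict] = []
    for box, label in annotations:
        for c in clusters:
            if iou(box, c["boxes"][0]) >= iou_threshold:
                c["boxes"].append(box)
                c["labels"].append(label)
                break
        else:
            clusters.append({"boxes": [box], "labels": [label]})
    return [(c["boxes"][0], Counter(c["labels"]).most_common(1)[0][0]) for c in clusters]

# Three annotators labeled roughly the same region; the majority class wins.
repeated = [((10, 10, 50, 50), "figure"), ((12, 11, 49, 52), "figure"), ((11, 9, 51, 50), "table")]
print(majority_vote(repeated))
```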
10. The Information Retrieval Experiment Platform
- Author
-
Fröbe, Maik, Reimer, Jan Heinrich, MacAvaney, Sean, Deckers, Niklas, Reich, Simon, Bevendorff, Janek, Stein, Benno, Hagen, Matthias, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval - Abstract
We integrate ir_datasets, ir_measures, and PyTerrier with TIRA in the Information Retrieval Experiment Platform (TIREx) to promote more standardized, reproducible, scalable, and even blinded retrieval experiments. Standardization is achieved when a retrieval approach implements PyTerrier's interfaces and the input and output of an experiment are compatible with ir_datasets and ir_measures. However, none of this is a must for reproducibility and scalability, as TIRA can run any dockerized software locally or remotely in a cloud-native execution environment. Version control and caching ensure efficient (re)execution. TIRA allows for blind evaluation when an experiment runs on a remote server or cloud not under the control of the experimenter. The test data and ground truth are then hidden from public access, and the retrieval software has to process them in a sandbox that prevents data leaks. We currently host an instance of TIREx with 15 corpora (1.9 billion documents) on which 32 shared retrieval tasks are based. Using Docker images of 50 standard retrieval approaches, we automatically evaluated all approaches on all tasks (50 $\cdot$ 32 = 1,600 runs) in less than a week on a midsize cluster (1,620 CPU cores and 24 GPUs). This instance of TIREx is open for submissions and will be integrated with the IR Anthology, as well as released open source., Comment: 11 pages. To be published in the proceedings of SIGIR 2023
- Published
- 2023
- Full Text
- View/download PDF
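Outside of TIRA's sandboxed execution, the standardization described above comes down to implementing the PyTerrier interfaces so that ir_datasets and ir_measures handle data and evaluation. A minimal local sketch under the assumption that PyTerrier and its ir_datasets bridge are installed; the small "vaswani" test collection stands in for the 32 TIREx tasks, and metric strings may need adjusting to the installed version:

```python
import os
import pyterrier as pt

if not pt.started():
    pt.init()

# Load a small test collection through PyTerrier's ir_datasets integration.
dataset = pt.get_dataset("irds:vaswani")

# Index the documents and set up a BM25 retrieval pipeline.
indexer = pt.IterDictIndexer(os.path.abspath("./vaswani-index"))
index_ref = indexer.index(dataset.get_corpus_iter())
bm25 = pt.BatchRetrieve(index_ref, wmodel="BM25")

# ir_measures-style metric names drive the evaluation.
results = pt.Experiment(
    [bm25],
    dataset.get_topics(),
    dataset.get_qrels(),
    eval_metrics=["map", "nDCG@10"],
)
print(results)
```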
11. Perspectives on Large Language Models for Relevance Judgment
- Author
-
Faggioli, Guglielmo, Dietz, Laura, Clarke, Charles, Demartini, Gianluca, Hagen, Matthias, Hauff, Claudia, Kando, Noriko, Kanoulas, Evangelos, Potthast, Martin, Stein, Benno, and Wachsmuth, Henning
- Subjects
Computer Science - Information Retrieval, Computer Science - Computers and Society, H.3.3 - Abstract
When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments, but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems. In this perspectives paper, we discuss possible ways for LLMs to support relevance judgments along with concerns and issues that arise. We devise a human-machine collaboration spectrum that allows us to categorize different relevance judgment strategies, based on how much humans rely on machines. For the extreme point of "fully automated judgments", we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing opposing perspectives for and against the use of LLMs for automatic relevance judgments, and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers.
- Published
- 2023
- Full Text
- View/download PDF
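The "fully automated judgments" end of the spectrum discussed above amounts to prompting an LLM with a query and a document and parsing a graded label from its answer. A minimal, model-agnostic sketch; the prompt wording and the `ask_llm` callable are hypothetical placeholders, not the prompt used in the paper's pilot experiment:

```python
from typing import Callable

PROMPT_TEMPLATE = """You are a relevance assessor.
Query: {query}
Document: {document}
On a scale from 0 (not relevant) to 3 (perfectly relevant), answer with a single digit."""

def judge_relevance(query: str, document: str, ask_llm: Callable[[str], str]) -> int:
    """Ask an LLM for a graded relevance label; fall back to 0 if the reply is unparsable."""
    reply = ask_llm(PROMPT_TEMPLATE.format(query=query, document=document))
    digits = [c for c in reply if c.isdigit()]
    return min(int(digits[0]), 3) if digits else 0

# Stand-in "LLM" for demonstration purposes only.
fake_llm = lambda prompt: "2 - the document partially addresses the query."
print(judge_relevance("effects of caffeine on sleep", "Caffeine delays sleep onset ...", fake_llm))
```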
12. The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives
- Author
-
Reimer, Jan Heinrich, Schmidt, Sebastian, Fröbe, Maik, Gienapp, Lukas, Scells, Harrisen, Stein, Benno, Hagen, Matthias, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval - Abstract
The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry., Comment: SIGIR 2023 resource paper, 13 pages
- Published
- 2023
- Full Text
- View/download PDF
13. The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments
- Author
-
Mirzakhmedova, Nailia, Kiesel, Johannes, Alshomary, Milad, Heinrich, Maximilian, Handke, Nicolas, Cai, Xiaoni, Valentin, Barriere, Dastgheib, Doratossadat, Ghahroodi, Omid, Sadraei, Mohammad Ali, Asgari, Ehsaneddin, Kawaletz, Lea, Wachsmuth, Henning, and Stein, Benno
- Subjects
Computer Science - Computation and Language - Abstract
We present the Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. To investigate approaches for the automated detection of human values behind arguments, we collected 9324 arguments from 6 diverse sources, covering religious texts, political discussions, free-text arguments, newspaper editorials, and online democracy platforms. Each argument was annotated by 3 crowdworkers for 54 values. The Touché23-ValueEval dataset extends the Webis-ArgValues-22. In comparison to the previous dataset, the effectiveness of a 1-Baseline decreases, but that of an out-of-the-box BERT model increases. Therefore, though the classification difficulty increased as per the label distribution, the larger dataset allows for training better models.
- Published
- 2023
14. Paraphrase Acquisition from Image Captions
- Author
-
Gohsen, Marcel, Hagen, Matthias, Potthast, Martin, and Stein, Benno
- Subjects
Computer Science - Computation and Language - Abstract
We propose to use image captions from the Web as a previously underutilized resource for paraphrases (i.e., texts with the same "message") and to create and analyze a corresponding dataset. When an image is reused on the Web, an original caption is often assigned. We hypothesize that different captions for the same image naturally form a set of mutual paraphrases. To demonstrate the suitability of this idea, we analyze captions in the English Wikipedia, where editors frequently relabel the same image for different articles. The paper introduces the underlying mining technology, the resulting Wikipedia-IPC dataset, and compares known paraphrase corpora with respect to their syntactic and semantic paraphrase similarity to our new resource. In this context, we introduce characteristic maps along the two similarity dimensions to identify the style of paraphrases coming from different sources. An annotation study demonstrates the high reliability of the algorithmically determined characteristic maps.
- Published
- 2023
15. Topic Ontologies for Arguments
- Author
-
Ajjour, Yamen, Kiesel, Johannes, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Computation and Language - Abstract
Many computational argumentation tasks, like stance classification, are topic-dependent: the effectiveness of approaches to these tasks significantly depends on whether the approaches were trained on arguments from the same topics as those they are tested on. So, which are these topics that researchers train approaches on? This paper contributes the first comprehensive survey of topic coverage, assessing 45 argument corpora. For the assessment, we take the first step towards building an argument topic ontology, consulting three diverse authoritative sources: the World Economic Forum, the Wikipedia list of controversial topics, and Debatepedia. Comparing the topic sets between the authoritative sources and corpora, our analysis shows that the corpora topics - which are mostly those frequently discussed in public online fora - are covered well by the sources. However, other topics from the sources are less extensively covered by the corpora of today, revealing interesting future directions for corpus construction.
- Published
- 2023
16. Overview of Touché 2024: Argumentation Systems
- Author
-
Kiesel, Johannes, Çöltekin, Çağrı, Heinrich, Maximilian, Fröbe, Maik, Alshomary, Milad, De Longueville, Bertrand, Erjavec, Tomaž, Handke, Nicolas, Kopp, Matyáš, Ljubešić, Nikola, Meden, Katja, Mirzhakhmedova, Nailia, Morkevičius, Vaidas, Reitis-Münstermann, Theresa, Scharfbillig, Mario, Stefanovitch, Nicolas, Wachsmuth, Henning, Potthast, Martin, Stein, Benno, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goeuriot, Lorraine, editor, Mulhem, Philippe, editor, Quénot, Georges, editor, Schwab, Didier, editor, Di Nunzio, Giorgio Maria, editor, Soulier, Laure, editor, Galuščáková, Petra, editor, García Seco de Herrera, Alba, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2024
- Full Text
- View/download PDF
17. Overview of the ImageCLEF 2024: Multimedia Retrieval in Medical Applications
- Author
-
Ionescu, Bogdan, Müller, Henning, Drăgulinescu, Ana-Maria, Rückert, Johannes, Ben Abacha, Asma, García Seco de Herrera, Alba, Bloch, Louise, Brüngel, Raphael, Idrissi-Yaghir, Ahmad, Schäfer, Henning, Schmidt, Cynthia Sabrina, Pakull, Tabea M. G., Damm, Hendrik, Bracke, Benjamin, Friedrich, Christoph M., Andrei, Alexandra-Georgiana, Prokopchuk, Yuri, Karpenka, Dzmitry, Radzhabov, Ahmedkhan, Kovalev, Vassili, Macaire, Cécile, Schwab, Didier, Lecouteux, Benjamin, Esperança-Rodier, Emmanuelle, Yim, Wen-Wai, Fu, Yujuan, Sun, Zhaoyi, Yetisgen, Meliha, Xia, Fei, Hicks, Steven A., Riegler, Michael A., Thambawita, Vajira, Storås, Andrea, Halvorsen, Pål, Heinrich, Maximilian, Kiesel, Johannes, Potthast, Martin, Stein, Benno, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goeuriot, Lorraine, editor, Mulhem, Philippe, editor, Quénot, Georges, editor, Schwab, Didier, editor, Di Nunzio, Giorgio Maria, editor, Soulier, Laure, editor, Galuščáková, Petra, editor, García Seco de Herrera, Alba, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2024
- Full Text
- View/download PDF
18. Overview of PAN 2024: Multi-author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification Condensed Lab Overview
- Author
-
Ayele, Abinew Ali, Babakov, Nikolay, Bevendorff, Janek, Casals, Xavier Bonet, Chulvi, Berta, Dementieva, Daryna, Elnagar, Ashaf, Freitag, Dayne, Fröbe, Maik, Korenčić, Damir, Mayerl, Maximilian, Moskovskiy, Daniil, Mukherjee, Animesh, Panchenko, Alexander, Potthast, Martin, Rangel, Francisco, Rizwan, Naquee, Rosso, Paolo, Schneider, Florian, Smirnova, Alisa, Stamatatos, Efstathios, Stakovskii, Elisei, Stein, Benno, Taulé, Mariona, Ustalov, Dmitry, Wang, Xintong, Wiegmann, Matti, Yimam, Seid Muhie, Zangerle, Eva, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goeuriot, Lorraine, editor, Mulhem, Philippe, editor, Quénot, Georges, editor, Schwab, Didier, editor, Di Nunzio, Giorgio Maria, editor, Soulier, Laure, editor, Galuščáková, Petra, editor, García Seco de Herrera, Alba, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2024
- Full Text
- View/download PDF
19. De-noising Document Classification Benchmarks via Prompt-Based Rank Pruning: A Case Study
- Author
-
Wiegmann, Matti, Stein, Benno, Potthast, Martin, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goeuriot, Lorraine, editor, Mulhem, Philippe, editor, Quénot, Georges, editor, Schwab, Didier, editor, Di Nunzio, Giorgio Maria, editor, Soulier, Laure, editor, Galuščáková, Petra, editor, García Seco de Herrera, Alba, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2024
- Full Text
- View/download PDF
20. Who Will Evaluate the Evaluators? Exploring the Gen-IR User Simulation Space
- Author
-
Kiesel, Johannes, Gohsen, Marcel, Mirzakhmedova, Nailia, Hagen, Matthias, Stein, Benno, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goeuriot, Lorraine, editor, Mulhem, Philippe, editor, Quénot, Georges, editor, Schwab, Didier, editor, Di Nunzio, Giorgio Maria, editor, Soulier, Laure, editor, Galuščáková, Petra, editor, García Seco de Herrera, Alba, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2024
- Full Text
- View/download PDF
21. The Open Web Index : Crawling and Indexing the Web for Public Use
- Author
-
Hendriksen, Gijs, Dinzinger, Michael, Farzana, Sheikh Mastura, Fathima, Noor Afshan, Fröbe, Maik, Schmidt, Sebastian, Zerhoudi, Saber, Granitzer, Michael, Hagen, Matthias, Hiemstra, Djoerd, Potthast, Martin, Stein, Benno, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goharian, Nazli, editor, Tonellotto, Nicola, editor, He, Yulan, editor, Lipani, Aldo, editor, McDonald, Graham, editor, Macdonald, Craig, editor, and Ounis, Iadh, editor
- Published
- 2024
- Full Text
- View/download PDF
22. Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines
- Author
-
Bevendorff, Janek, Wiegmann, Matti, Potthast, Martin, Stein, Benno, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goharian, Nazli, editor, Tonellotto, Nicola, editor, He, Yulan, editor, Lipani, Aldo, editor, McDonald, Graham, editor, Macdonald, Craig, editor, and Ounis, Iadh, editor
- Published
- 2024
- Full Text
- View/download PDF
23. Advancing Multimedia Retrieval in Medical, Social Media and Content Recommendation Applications with ImageCLEF 2024
- Author
-
Ionescu, Bogdan, Müller, Henning, Drăgulinescu, Ana Maria, Idrissi-Yaghir, Ahmad, Radzhabov, Ahmedkhan, Herrera, Alba Garcia Seco de, Andrei, Alexandra, Stan, Alexandru, Storås, Andrea M., Abacha, Asma Ben, Lecouteux, Benjamin, Stein, Benno, Macaire, Cécile, Friedrich, Christoph M., Schmidt, Cynthia Sabrina, Schwab, Didier, Esperança-Rodier, Emmanuelle, Ioannidis, George, Adams, Griffin, Schäfer, Henning, Manguinhas, Hugo, Coman, Ioan, Schöler, Johanna, Kiesel, Johannes, Rückert, Johannes, Bloch, Louise, Potthast, Martin, Heinrich, Maximilian, Yetisgen, Meliha, Riegler, Michael A., Snider, Neal, Halvorsen, Pål, Brüngel, Raphael, Hicks, Steven A., Thambawita, Vajira, Kovalev, Vassili, Prokopchuk, Yuri, Yim, Wen-Wai, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goharian, Nazli, editor, Tonellotto, Nicola, editor, He, Yulan, editor, Lipani, Aldo, editor, McDonald, Graham, editor, Macdonald, Craig, editor, and Ounis, Iadh, editor
- Published
- 2024
- Full Text
- View/download PDF
24. Overview of PAN 2024: Multi-author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification : Extended Abstract
- Author
-
Bevendorff, Janek, Casals, Xavier Bonet, Chulvi, Berta, Dementieva, Daryna, Elnagar, Ashaf, Freitag, Dayne, Fröbe, Maik, Korenčić, Damir, Mayerl, Maximilian, Mukherjee, Animesh, Panchenko, Alexander, Potthast, Martin, Rangel, Francisco, Rosso, Paolo, Smirnova, Alisa, Stamatatos, Efstathios, Stein, Benno, Taulé, Mariona, Ustalov, Dmitry, Wiegmann, Matti, Zangerle, Eva, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goharian, Nazli, editor, Tonellotto, Nicola, editor, He, Yulan, editor, Lipani, Aldo, editor, McDonald, Graham, editor, Macdonald, Craig, editor, and Ounis, Iadh, editor
- Published
- 2024
- Full Text
- View/download PDF
25. Simulating Follow-Up Questions in Conversational Search
- Author
-
Kiesel, Johannes, Gohsen, Marcel, Mirzakhmedova, Nailia, Hagen, Matthias, Stein, Benno, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goharian, Nazli, editor, Tonellotto, Nicola, editor, He, Yulan, editor, Lipani, Aldo, editor, McDonald, Graham, editor, Macdonald, Craig, editor, and Ounis, Iadh, editor
- Published
- 2024
- Full Text
- View/download PDF
26. The Infinite Index: Information Retrieval on Generative Text-To-Image Models
- Author
-
Deckers, Niklas, Fröbe, Maik, Kiesel, Johannes, Pandolfo, Gianluca, Schröder, Christopher, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition - Abstract
Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user's information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of "infinite index". We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images., Comment: Final version for CHIIR 2023
- Published
- 2022
- Full Text
- View/download PDF
27. SMAuC - The Scientific Multi-Authorship Corpus
- Author
-
Bevendorff, Janek, Sauer, Philipp, Gienapp, Lukas, Kircheis, Wolfgang, Körner, Erik, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Computation and Language, Computer Science - Digital Libraries - Abstract
The rapidly growing volume of scientific publications offers an interesting challenge for research on methods for analyzing the authorship of documents with one or more authors. However, most existing datasets lack scientific documents or the necessary metadata for constructing new experiments and test cases. We introduce SMAuC, a comprehensive, metadata-rich corpus tailored to scientific authorship analysis. Comprising over 3 million publications across various disciplines from over 5 million authors, SMAuC is the largest openly accessible corpus for this purpose. It encompasses scientific texts from humanities and natural sciences, accompanied by extensive, curated metadata, including unambiguous author IDs. SMAuC aims to significantly advance the domain of authorship analysis in scientific texts.
- Published
- 2022
28. Differential Bias: On the Perceptibility of Stance Imbalance in Argumentation
- Author
-
Palomino, Alonso, Potthast, Martin, Al-Khatib, Khalid, and Stein, Benno
- Subjects
Computer Science - Computation and Language, Computer Science - Information Retrieval - Abstract
Most research on natural language processing treats bias as an absolute concept: Based on a (probably complex) algorithmic analysis, a sentence, an article, or a text is classified as biased or not. Given the fact that for humans the question of whether a text is biased can be difficult to answer or is answered contradictorily, we ask whether an "absolute bias classification" is a promising goal at all. We see the problem not in the complexity of interpreting language phenomena but in the diversity of sociocultural backgrounds of the readers, which cannot be handled uniformly: To decide whether a text has crossed the proverbial line between non-biased and biased is subjective. By asking "Is text X more [less, equally] biased than text Y?" we propose to analyze a simpler problem, which, by its construction, is rather independent of standpoints, views, or sociocultural aspects. In such a model, bias becomes a preference relation that induces a partial ordering from least biased to most biased texts without requiring a decision on where to draw the line. A prerequisite for this kind of bias model is the ability of humans to perceive relative bias differences in the first place. In our research, we selected a specific type of bias in argumentation, the stance bias, and designed a crowdsourcing study showing that differences in stance bias are perceptible when (light) support is provided through training or visual aids., Comment: Accepted at AACL-IJCNLP 2022, Findings Volume
- Published
- 2022
29. Trigger Warnings: Bootstrapping a Violence Detector for FanFiction
- Author
-
Wolska, Magdalena, Schröder, Christopher, Borchardt, Ole, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Computation and Language - Abstract
We present the first dataset and evaluation results on a newly defined computational task of trigger warning assignment. Labeled corpus data has been compiled from narrative works hosted on Archive of Our Own (AO3), a well-known fanfiction site. In this paper, we focus on the most frequently assigned trigger type - violence - and define a document-level binary classification task of whether or not to assign a violence trigger warning to a fanfiction, exploiting warning labels provided by AO3 authors. SVM and BERT models trained in four evaluation setups on the corpora we compiled yield $F_1$ results ranging from 0.585 to 0.798, proving violence trigger warning assignment to be a doable, yet non-trivial, task., Comment: 5 pages
- Published
- 2022
30. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
- Author
-
Srivastava, Aarohi, Rastogi, Abhinav, Rao, Abhishek, Shoeb, Abu Awal Md, Abid, Abubakar, Fisch, Adam, Brown, Adam R., Santoro, Adam, Gupta, Aditya, Garriga-Alonso, Adrià, Kluska, Agnieszka, Lewkowycz, Aitor, Agarwal, Akshat, Power, Alethea, Ray, Alex, Warstadt, Alex, Kocurek, Alexander W., Safaya, Ali, Tazarv, Ali, Xiang, Alice, Parrish, Alicia, Nie, Allen, Hussain, Aman, Askell, Amanda, Dsouza, Amanda, Slone, Ambrose, Rahane, Ameet, Iyer, Anantharaman S., Andreassen, Anders, Madotto, Andrea, Santilli, Andrea, Stuhlmüller, Andreas, Dai, Andrew, La, Andrew, Lampinen, Andrew, Zou, Andy, Jiang, Angela, Chen, Angelica, Vuong, Anh, Gupta, Animesh, Gottardi, Anna, Norelli, Antonio, Venkatesh, Anu, Gholamidavoodi, Arash, Tabassum, Arfa, Menezes, Arul, Kirubarajan, Arun, Mullokandov, Asher, Sabharwal, Ashish, Herrick, Austin, Efrat, Avia, Erdem, Aykut, Karakaş, Ayla, Roberts, B. Ryan, Loe, Bao Sheng, Zoph, Barret, Bojanowski, Bartłomiej, Özyurt, Batuhan, Hedayatnia, Behnam, Neyshabur, Behnam, Inden, Benjamin, Stein, Benno, Ekmekci, Berk, Lin, Bill Yuchen, Howald, Blake, Orinion, Bryan, Diao, Cameron, Dour, Cameron, Stinson, Catherine, Argueta, Cedrick, Ramírez, César Ferri, Singh, Chandan, Rathkopf, Charles, Meng, Chenlin, Baral, Chitta, Wu, Chiyu, Callison-Burch, Chris, Waites, Chris, Voigt, Christian, Manning, Christopher D., Potts, Christopher, Ramirez, Cindy, Rivera, Clara E., Siro, Clemencia, Raffel, Colin, Ashcraft, Courtney, Garbacea, Cristina, Sileo, Damien, Garrette, Dan, Hendrycks, Dan, Kilman, Dan, Roth, Dan, Freeman, Daniel, Khashabi, Daniel, Levy, Daniel, González, Daniel Moseguí, Perszyk, Danielle, Hernandez, Danny, Chen, Danqi, Ippolito, Daphne, Gilboa, Dar, Dohan, David, Drakard, David, Jurgens, David, Datta, Debajyoti, Ganguli, Deep, Emelin, Denis, Kleyko, Denis, Yuret, Deniz, Chen, Derek, Tam, Derek, Hupkes, Dieuwke, Misra, Diganta, Buzan, Dilyar, Mollo, Dimitri Coelho, Yang, Diyi, Lee, Dong-Ho, Schrader, Dylan, Shutova, Ekaterina, Cubuk, Ekin Dogus, Segal, Elad, Hagerman, Eleanor, Barnes, Elizabeth, Donoway, Elizabeth, Pavlick, Ellie, Rodola, Emanuele, Lam, Emma, Chu, Eric, Tang, Eric, Erdem, Erkut, Chang, Ernie, Chi, Ethan A., Dyer, Ethan, Jerzak, Ethan, Kim, Ethan, Manyasi, Eunice Engefu, Zheltonozhskii, Evgenii, Xia, Fanyue, Siar, Fatemeh, Martínez-Plumed, Fernando, Happé, Francesca, Chollet, Francois, Rong, Frieda, Mishra, Gaurav, Winata, Genta Indra, de Melo, Gerard, Kruszewski, Germán, Parascandolo, Giambattista, Mariani, Giorgio, Wang, Gloria, Jaimovitch-López, Gonzalo, Betz, Gregor, Gur-Ari, Guy, Galijasevic, Hana, Kim, Hannah, Rashkin, Hannah, Hajishirzi, Hannaneh, Mehta, Harsh, Bogar, Hayden, Shevlin, Henry, Schütze, Hinrich, Yakura, Hiromu, Zhang, Hongming, Wong, Hugh Mee, Ng, Ian, Noble, Isaac, Jumelet, Jaap, Geissinger, Jack, Kernion, Jackson, Hilton, Jacob, Lee, Jaehoon, Fisac, Jaime Fernández, Simon, James B., Koppel, James, Zheng, James, Zou, James, Kocoń, Jan, Thompson, Jana, Wingfield, Janelle, Kaplan, Jared, Radom, Jarema, Sohl-Dickstein, Jascha, Phang, Jason, Wei, Jason, Yosinski, Jason, Novikova, Jekaterina, Bosscher, Jelle, Marsh, Jennifer, Kim, Jeremy, Taal, Jeroen, Engel, Jesse, Alabi, Jesujoba, Xu, Jiacheng, Song, Jiaming, Tang, Jillian, Waweru, Joan, Burden, John, Miller, John, Balis, John U., Batchelder, Jonathan, Berant, Jonathan, Frohberg, Jörg, Rozen, Jos, Hernandez-Orallo, Jose, Boudeman, Joseph, Guerr, Joseph, Jones, Joseph, Tenenbaum, Joshua B., Rule, Joshua S., Chua, Joyce, Kanclerz, Kamil, Livescu, Karen, Krauth, Karl, Gopalakrishnan, Karthik, 
Ignatyeva, Katerina, Markert, Katja, Dhole, Kaustubh D., Gimpel, Kevin, Omondi, Kevin, Mathewson, Kory, Chiafullo, Kristen, Shkaruta, Ksenia, Shridhar, Kumar, McDonell, Kyle, Richardson, Kyle, Reynolds, Laria, Gao, Leo, Zhang, Li, Dugan, Liam, Qin, Lianhui, Contreras-Ochando, Lidia, Morency, Louis-Philippe, Moschella, Luca, Lam, Lucas, Noble, Lucy, Schmidt, Ludwig, He, Luheng, Colón, Luis Oliveros, Metz, Luke, Şenel, Lütfi Kerem, Bosma, Maarten, Sap, Maarten, ter Hoeve, Maartje, Farooqi, Maheen, Faruqui, Manaal, Mazeika, Mantas, Baturan, Marco, Marelli, Marco, Maru, Marco, Quintana, Maria Jose Ramírez, Tolkiehn, Marie, Giulianelli, Mario, Lewis, Martha, Potthast, Martin, Leavitt, Matthew L., Hagen, Matthias, Schubert, Mátyás, Baitemirova, Medina Orduna, Arnaud, Melody, McElrath, Melvin, Yee, Michael A., Cohen, Michael, Gu, Michael, Ivanitskiy, Michael, Starritt, Michael, Strube, Michael, Swędrowski, Michał, Bevilacqua, Michele, Yasunaga, Michihiro, Kale, Mihir, Cain, Mike, Xu, Mimee, Suzgun, Mirac, Walker, Mitch, Tiwari, Mo, Bansal, Mohit, Aminnaseri, Moin, Geva, Mor, Gheini, Mozhdeh, T, Mukund Varma, Peng, Nanyun, Chi, Nathan A., Lee, Nayeon, Krakover, Neta Gur-Ari, Cameron, Nicholas, Roberts, Nicholas, Doiron, Nick, Martinez, Nicole, Nangia, Nikita, Deckers, Niklas, Muennighoff, Niklas, Keskar, Nitish Shirish, Iyer, Niveditha S., Constant, Noah, Fiedel, Noah, Wen, Nuan, Zhang, Oliver, Agha, Omar, Elbaghdadi, Omar, Levy, Omer, Evans, Owain, Casares, Pablo Antonio Moreno, Doshi, Parth, Fung, Pascale, Liang, Paul Pu, Vicol, Paul, Alipoormolabashi, Pegah, Liao, Peiyuan, Liang, Percy, Chang, Peter, Eckersley, Peter, Htut, Phu Mon, Hwang, Pinyu, Miłkowski, Piotr, Patil, Piyush, Pezeshkpour, Pouya, Oli, Priti, Mei, Qiaozhu, Lyu, Qing, Chen, Qinlang, Banjade, Rabin, Rudolph, Rachel Etta, Gabriel, Raefer, Habacker, Rahel, Risco, Ramon, Millière, Raphaël, Garg, Rhythm, Barnes, Richard, Saurous, Rif A., Arakawa, Riku, Raymaekers, Robbe, Frank, Robert, Sikand, Rohan, Novak, Roman, Sitelew, Roman, LeBras, Ronan, Liu, Rosanne, Jacobs, Rowan, Zhang, Rui, Salakhutdinov, Ruslan, Chi, Ryan, Lee, Ryan, Stovall, Ryan, Teehan, Ryan, Yang, Rylan, Singh, Sahib, Mohammad, Saif M., Anand, Sajant, Dillavou, Sam, Shleifer, Sam, Wiseman, Sam, Gruetter, Samuel, Bowman, Samuel R., Schoenholz, Samuel S., Han, Sanghyun, Kwatra, Sanjeev, Rous, Sarah A., Ghazarian, Sarik, Ghosh, Sayan, Casey, Sean, Bischoff, Sebastian, Gehrmann, Sebastian, Schuster, Sebastian, Sadeghi, Sepideh, Hamdan, Shadi, Zhou, Sharon, Srivastava, Shashank, Shi, Sherry, Singh, Shikhar, Asaadi, Shima, Gu, Shixiang Shane, Pachchigar, Shubh, Toshniwal, Shubham, Upadhyay, Shyam, Shyamolima, Debnath, Shakeri, Siamak, Thormeyer, Simon, Melzi, Simone, Reddy, Siva, Makini, Sneha Priscilla, Lee, Soo-Hwan, Torene, Spencer, Hatwar, Sriharsha, Dehaene, Stanislas, Divic, Stefan, Ermon, Stefano, Biderman, Stella, Lin, Stephanie, Prasad, Stephen, Piantadosi, Steven T., Shieber, Stuart M., Misherghi, Summer, Kiritchenko, Svetlana, Mishra, Swaroop, Linzen, Tal, Schuster, Tal, Li, Tao, Yu, Tao, Ali, Tariq, Hashimoto, Tatsu, Wu, Te-Lin, Desbordes, Théo, Rothschild, Theodore, Phan, Thomas, Wang, Tianle, Nkinyili, Tiberius, Schick, Timo, Kornev, Timofei, Tunduny, Titus, Gerstenberg, Tobias, Chang, Trenton, Neeraj, Trishala, Khot, Tushar, Shultz, Tyler, Shaham, Uri, Misra, Vedant, Demberg, Vera, Nyamai, Victoria, Raunak, Vikas, Ramasesh, Vinay, Prabhu, Vinay Uday, Padmakumar, Vishakh, Srikumar, Vivek, Fedus, William, Saunders, William, Zhang, William, Vossen, Wout, Ren, 
Xiang, Tong, Xiaoyu, Zhao, Xinran, Wu, Xinyi, Shen, Xudong, Yaghoobzadeh, Yadollah, Lakretz, Yair, Song, Yangqiu, Bahri, Yasaman, Choi, Yejin, Yang, Yichi, Hao, Yiding, Chen, Yifu, Belinkov, Yonatan, Hou, Yu, Hou, Yufang, Bai, Yuntao, Seid, Zachary, Zhao, Zhuoye, Wang, Zijian, Wang, Zijie J., Wang, Zirui, and Wu, Ziyi
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Machine Learning, Statistics - Machine Learning - Abstract
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting., Comment: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench
- Published
- 2022
31. STEREO: Scientific Text Reuse in Open Access Publications
- Author
-
Gienapp, Lukas, Kircheis, Wolfgang, Sievers, Bjarne, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Digital Libraries, Computer Science - Computation and Language, Computer Science - Information Retrieval - Abstract
We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications. It contains more than 91 million cases of reused text passages found in 4.2 million unique open-access publications. Featuring a high coverage of scientific disciplines and varieties of reuse, as well as comprehensive metadata to contextualize each case, our dataset addresses the most salient shortcomings of previous ones on scientific writing. Webis-STEREO-21 allows for tackling a wide range of research questions from different scientific backgrounds, facilitating both qualitative and quantitative analysis of the phenomenon as well as a first-time grounding on the base rate of text reuse in scientific publications., Comment: 14 pages, 3 figures, 4 tables
- Published
- 2021
32. FastWARC: Optimizing Large-Scale Web Archive Analytics
- Author
-
Bevendorff, Janek, Potthast, Martin, and Stein, Benno
- Subjects
Computer Science - Information Retrieval - Abstract
Web search and other large-scale web data analytics rely on processing archives of web pages stored in a standardized and efficient format. Since its introduction in 2008, the IIPC's Web ARChive (WARC) format has become the standard format for this purpose. As a list of individually compressed records of HTTP requests and responses, it allows for constant-time random access to all kinds of web data via off-the-shelf open source parsers in many programming languages, such as WARCIO, the de-facto standard for Python. When processing web archives at the terabyte or petabyte scale, however, even small inefficiencies in these tools add up quickly, resulting in hours, days, or even weeks of wasted compute time. Reviewing the basic components of WARCIO and analyzing its bottlenecks, we proceed to build FastWARC, a new high-performance WARC processing library for Python, written in C++/Cython, which yields performance improvements by factors of 1.6-8x.
- Published
- 2021
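For scale-sensitive pipelines like the one described above, the gain comes from iterating WARC records with FastWARC instead of WARCIO. A minimal sketch based on the FastWARC documentation as we recall it; the file name is a placeholder, and attribute names should be checked against the current docs:

```python
from fastwarc.warc import ArchiveIterator, WarcRecordType

# Stream only HTTP response records from a (gzip-compressed) WARC file.
with open("example.warc.gz", "rb") as stream:  # placeholder path
    for record in ArchiveIterator(stream, record_types=WarcRecordType.response):
        uri = record.headers.get("WARC-Target-URI")
        body = record.reader.read()  # raw payload bytes
        print(uri, len(body))
```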
33. The Impact of Main Content Extraction on Near-Duplicate Detection
- Author
-
Fröbe, Maik, Hagen, Matthias, Bevendorff, Janek, Völske, Michael, Stein, Benno, Schröder, Christopher, Wagner, Robby, Gienapp, Lukas, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval - Abstract
Commercial web search engines employ near-duplicate detection to ensure that users see each relevant result only once, even though the underlying web crawls typically include (near-)duplicates of many web pages. We revisit the risks and potential of near-duplicates with an information retrieval focus, motivating that current efforts toward an open and independent European web search infrastructure should maintain metadata on duplicate and near-duplicate documents in its index. Near-duplicate detection implemented in an open web search infrastructure should provide a suitable similarity threshold, a difficult choice since identical pages may substantially differ in parts of a page that are irrelevant to searchers (templates, advertisements, etc.). We study this problem by comparing the similarity of pages for five (main) content extraction methods in two studies on the ClueWeb crawls. We find that the full content of pages serves precision-oriented near-duplicate detection, while main content extraction is more recall-oriented.
- Published
- 2021
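The similarity-threshold question raised above is easiest to see with the classic word-shingle representation: two pages count as near-duplicates if the Jaccard similarity of their shingle sets exceeds a threshold. A minimal sketch with plain n-gram shingles and exact Jaccard, not the ClueWeb-scale pipeline from the study:

```python
def shingles(text: str, n: int = 8) -> set:
    """Set of word n-grams ("shingles") of a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 1.0

def near_duplicates(doc_a: str, doc_b: str, threshold: float = 0.8, n: int = 8) -> bool:
    """Flag two documents as near-duplicates if their shingle sets are similar enough."""
    return jaccard(shingles(doc_a, n), shingles(doc_b, n)) >= threshold

# Same main content, slightly different page boilerplate.
page_full = "Breaking news article text surrounded by navigation menus ads and footer links " * 3
page_main = page_full + "extra comment section"
print(near_duplicates(page_full, page_main, threshold=0.8, n=3))
```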
34. Protokoll 24
- Author
-
Stein, Benno, primary, Simons, Arno, additional, Potthast, Martin, additional, Hagen, Saskia, additional, and Wörner, Kai, additional
- Published
- 2023
- Full Text
- View/download PDF
35. Controlled Neural Sentence-Level Reframing of News Articles
- Author
-
Chen, Wei-Fan, Al-Khatib, Khalid, Stein, Benno, and Wachsmuth, Henning
- Subjects
Computer Science - Computation and Language - Abstract
Framing a news article means to portray the reported event from a specific perspective, e.g., from an economic or a health perspective. Reframing means to change this perspective. Depending on the audience or the submessage, reframing can become necessary to achieve the desired effect on the readers. Reframing is related to adapting style and sentiment, which can be tackled with neural text generation techniques. However, it is more challenging since changing a frame requires rewriting entire sentences rather than single phrases. In this paper, we study how to computationally reframe sentences in news articles while maintaining their coherence to the context. We treat reframing as a sentence-level fill-in-the-blank task for which we train neural models on an existing media frame corpus. To guide the training, we propose three strategies: framed-language pretraining, named-entity preservation, and adversarial learning. We evaluate respective models automatically and manually for topic consistency, coherence, and successful reframing. Our results indicate that generating properly-framed text works well but with tradeoffs.
- Published
- 2021
36. Web Archive Analytics
- Author
-
Völske, Michael, Bevendorff, Janek, Kiesel, Johannes, Stein, Benno, Fröbe, Maik, Hagen, Matthias, and Potthast, Martin
- Subjects
Computer Science - Digital Libraries, Computer Science - Networking and Internet Architecture, Computer Science - Social and Information Networks - Abstract
Web archive analytics is the exploitation of publicly accessible web pages and their evolution for research purposes -- to the extent organizationally possible for researchers. In order to better understand the complexity of this task, the first part of this paper puts the entirety of the world's captured, created, and replicated data (the "Global Datasphere") in relation to other important data sets such as the public internet and its web pages, or what is preserved thereof by the Internet Archive. Recently, the Webis research group, a network of university chairs to which the authors belong, concluded an agreement with the Internet Archive to download a substantial part of its web archive for research purposes. The second part of the paper in hand describes our infrastructure for processing this data treasure: We will eventually host around 8 PB of web archive data from the Internet Archive and Common Crawl, with the goal of supplementing existing large scale web corpora and forming a non-biased subset of the 30 PB web archive at the Internet Archive., Comment: 12 pages, 5 figures. Published in the proceedings of INFORMATIK 2020
- Published
- 2021
- Full Text
- View/download PDF
37. Towards Axiomatic Explanations for Neural Ranking Models
- Author
-
Völske, Michael, Bondarenko, Alexander, Fröbe, Maik, Hagen, Matthias, Stein, Benno, Singh, Jaspreet, and Anand, Avishek
- Subjects
Computer Science - Information Retrieval - Abstract
Recently, neural networks have been successfully employed to improve upon state-of-the-art performance in ad-hoc retrieval tasks via machine-learned ranking functions. While neural retrieval models grow in complexity and impact, little is understood about their correspondence with well-studied IR principles. Recent work on interpretability in machine learning has provided tools and techniques to understand neural models in general, yet there has been little progress towards explaining ranking models. We investigate whether one can explain the behavior of neural ranking models in terms of their congruence with well understood principles of document ranking by using established theories from axiomatic IR. Axiomatic analysis of information retrieval models has formalized a set of constraints on ranking decisions that reasonable retrieval models should fulfill. We operationalize this axiomatic thinking to reproduce rankings based on combinations of elementary constraints. This allows us to investigate to what extent the ranking decisions of neural rankers can be explained in terms of retrieval axioms, and which axioms apply in which situations. Our experimental study considers a comprehensive set of axioms over several representative neural rankers. While the existing axioms can already explain the particularly confident ranking decisions rather well, future work should extend the axiom set to also cover the other still "unexplainable" neural IR rank decisions., Comment: 10 pages, 2 figures. Published in the proceedings of ICTIR 2021
- Published
- 2021
- Full Text
- View/download PDF
38. Demanded Abstract Interpretation (Extended Version)
- Author
-
Stein, Benno, Chang, Bor-Yuh Evan, and Sridharan, Manu
- Subjects
Computer Science - Programming Languages - Abstract
We consider the problem of making expressive static analyzers interactive. Formal static analysis is seeing increasingly widespread adoption as a tool for verification and bug-finding, but even with powerful cloud infrastructure it can take minutes or hours to get batch analysis results after a code change. While existing techniques offer some demand-driven or incremental aspects for certain classes of analysis, the fundamental challenge we tackle is doing both for arbitrary abstract interpreters. Our technique, demanded abstract interpretation, lifts program syntax and analysis state to a dynamically evolving graph structure, in which program edits, client-issued queries, and evaluation of abstract semantics are all treated uniformly. The key difficulty addressed by our approach is the application of general incremental computation techniques to the complex, cyclic dependency structure induced by abstract interpretation of loops with widening operators. We prove that desirable abstract interpretation meta-properties, including soundness and termination, are preserved in our approach, and that demanded analysis results are equal to those computed by a batch abstract interpretation. Experimental results suggest promise for a prototype demanded abstract interpretation framework: by combining incremental and demand-driven techniques, our framework consistently delivers analysis results at interactive speeds, answering 95% of queries within 1.2 seconds., Comment: extended version of PLDI'21 paper (with appendices)
- Published
- 2021
- Full Text
- View/download PDF
39. Overview of Touché 2023: Argument and Causal Retrieval
- Author
-
Bondarenko, Alexander, Fröbe, Maik, Kiesel, Johannes, Schlatt, Ferdinand, Barriere, Valentin, Ravenet, Brian, Hemamou, Léo, Luck, Simon, Reimer, Jan Heinrich, Stein, Benno, Potthast, Martin, Hagen, Matthias, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Arampatzis, Avi, editor, Kanoulas, Evangelos, editor, Tsikrika, Theodora, editor, Vrochidis, Stefanos, editor, Giachanou, Anastasia, editor, Li, Dan, editor, Aliannejadi, Mohammad, editor, Vlachos, Michalis, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2023
- Full Text
- View/download PDF
40. Overview of PAN 2023: Authorship Verification, Multi-Author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection : Condensed Lab Overview
- Author
-
Bevendorff, Janek, Borrego-Obrador, Ian, Chinea-Ríos, Mara, Franco-Salvador, Marc, Fröbe, Maik, Heini, Annina, Kredens, Krzysztof, Mayerl, Maximilian, Pęzik, Piotr, Potthast, Martin, Rangel, Francisco, Rosso, Paolo, Stamatatos, Efstathios, Stein, Benno, Wiegmann, Matti, Wolska, Magdalena, Zangerle, Eva, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Arampatzis, Avi, editor, Kanoulas, Evangelos, editor, Tsikrika, Theodora, editor, Vrochidis, Stefanos, editor, Giachanou, Anastasia, editor, Li, Dan, editor, Aliannejadi, Mohammad, editor, Vlachos, Michalis, editor, Faggioli, Guglielmo, editor, and Ferro, Nicola, editor
- Published
- 2023
- Full Text
- View/download PDF
41. Overview of PAN 2023: Authorship Verification, Multi-author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection : Extended Abstract
- Author
-
Bevendorff, Janek, Chinea-Ríos, Mara, Franco-Salvador, Marc, Heini, Annina, Körner, Erik, Kredens, Krzysztof, Mayerl, Maximilian, Pęzik, Piotr, Potthast, Martin, Rangel, Francisco, Rosso, Paolo, Stamatatos, Efstathios, Stein, Benno, Wiegmann, Matti, Wolska, Magdalena, Zangerle, Eva, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kamps, Jaap, editor, Goeuriot, Lorraine, editor, Crestani, Fabio, editor, Maistro, Maria, editor, Joho, Hideo, editor, Davis, Brian, editor, Gurrin, Cathal, editor, Kruschwitz, Udo, editor, and Caputo, Annalina, editor
- Published
- 2023
- Full Text
- View/download PDF
42. Overview of Touché 2023: Argument and Causal Retrieval : Extended Abstract
- Author
-
Bondarenko, Alexander, Fröbe, Maik, Kiesel, Johannes, Schlatt, Ferdinand, Barriere, Valentin, Ravenet, Brian, Hemamou, Léo, Luck, Simon, Reimer, Jan Heinrich, Stein, Benno, Potthast, Martin, Hagen, Matthias, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kamps, Jaap, editor, Goeuriot, Lorraine, editor, Crestani, Fabio, editor, Maistro, Maria, editor, Joho, Hideo, editor, Davis, Brian, editor, Gurrin, Cathal, editor, Kruschwitz, Udo, editor, and Caputo, Annalina, editor
- Published
- 2023
43. Continuous Integration for Reproducible Shared Tasks with TIRA.io
- Author
-
Fröbe, Maik, Wiegmann, Matti, Kolyada, Nikolay, Grahm, Bastian, Elstner, Theresa, Loebe, Frank, Hagen, Matthias, Stein, Benno, Potthast, Martin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kamps, Jaap, editor, Goeuriot, Lorraine, editor, Crestani, Fabio, editor, Maistro, Maria, editor, Joho, Hideo, editor, Davis, Brian, editor, Gurrin, Cathal, editor, Kruschwitz, Udo, editor, and Caputo, Annalina, editor
- Published
- 2023
44. Dynamic Exploratory Search for the Information Retrieval Anthology
- Author
-
Gollub, Tim, Brockmeyer, Jason, Stein, Benno, Potthast, Martin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kamps, Jaap, editor, Goeuriot, Lorraine, editor, Crestani, Fabio, editor, Maistro, Maria, editor, Joho, Hideo, editor, Davis, Brian, editor, Gurrin, Cathal, editor, Kruschwitz, Udo, editor, and Caputo, Annalina, editor
- Published
- 2023
45. A diachronic perspective on citation latency in Wikipedia articles on CRISPR/Cas-9: an exploratory case study
- Author
-
Schmidt, Marion, Kircheis, Wolfgang, Simons, Arno, Potthast, Martin, and Stein, Benno
- Published
- 2023
46. Analyzing Political Bias and Unfairness in News Articles at Different Levels of Granularity
- Author
-
Chen, Wei-Fan, Al-Khatib, Khalid, Wachsmuth, Henning, and Stein, Benno
- Subjects
Computer Science - Computation and Language - Abstract
Media organizations bear great responsibility because of their considerable influence on shaping beliefs and positions of our society. Any form of media can contain overly biased content, e.g., by reporting on political events in a selective or incomplete manner. A relevant question hence is whether and how such forms of imbalanced news coverage can be exposed. The research presented in this paper addresses not only the automatic detection of bias but goes one step further in that it explores how political bias and unfairness are manifested linguistically. In this regard we utilize a new corpus of 6,964 news articles with labels derived from adfontesmedia.com and develop a neural model for bias assessment. By analyzing this model on article excerpts, we find insightful bias patterns at different levels of text granularity, from single words to the whole article discourse. (See the illustrative sketch after this record.)
- Published
- 2020
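The record above describes training a bias classifier on labeled articles and inspecting it at several levels of text granularity. As a rough, hedged illustration of that analysis setup, not the paper's neural model or corpus, the sketch below trains a simple linear classifier and reads off bias scores at article, sentence, and word level; the file name articles.csv and its columns are assumptions.

```python
# Hedged sketch, not the authors' model: a linear bias classifier whose
# predictions can be inspected at article, sentence, and word granularity.
# "articles.csv" with columns "text" and "bias" is a hypothetical placeholder.
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("articles.csv")
train_texts, test_texts, y_train, y_test = train_test_split(
    df["text"], df["bias"], test_size=0.2, random_state=0)

vec = TfidfVectorizer(max_features=50_000)
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(train_texts), y_train)

article = test_texts.iloc[0]

# Article-level granularity: bias probability for the whole article.
print("article:", clf.predict_proba(vec.transform([article]))[0])

# Sentence-level granularity: score each sentence to locate biased passages.
for sentence in article.split(". "):
    print(round(clf.predict_proba(vec.transform([sentence]))[0, 1], 3), sentence[:60])

# Word-level granularity: the largest positive coefficients indicate words
# the model associates with the biased class.
top = np.argsort(clf.coef_[0])[-10:]
print(vec.get_feature_names_out()[top])
```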
47. Detecting Media Bias in News Articles using Gaussian Bias Distributions
- Author
-
Chen, Wei-Fan, Al-Khatib, Khalid, Stein, Benno, and Wachsmuth, Henning
- Subjects
Computer Science - Computation and Language - Abstract
Media plays an important role in shaping public opinion. Biased media can influence people in undesirable directions and hence should be unmasked as such. We observe that feature-based and neural text classification approaches which rely only on the distribution of low-level lexical information fail to detect media bias. This weakness becomes most noticeable for articles on new events, where words appear in new contexts and hence their "bias predictiveness" is unclear. In this paper, we therefore study how second-order information about biased statements in an article helps to improve detection effectiveness. In particular, we utilize the probability distributions of the frequency, positions, and sequential order of lexical and informational sentence-level bias in a Gaussian Mixture Model. On an existing media bias dataset, we find that the frequency and positions of biased statements strongly impact article-level bias, whereas their exact sequential order is secondary. Using a standard model for sentence-level bias detection, we provide empirical evidence that article-level bias detectors that use second-order information clearly outperform those without. (See the illustrative sketch after this record.)
- Published
- 2020
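To make the second-order idea in the abstract above concrete, the following sketch, an assumption-laden illustration rather than the paper's exact model, summarizes each article's sentence-level bias predictions by their frequency and positions and fits one scikit-learn GaussianMixture per article-level class; the feature choices and the data layout articles_by_class are hypothetical.

```python
# Illustrative sketch of second-order bias modeling: per-article features
# derived from sentence-level bias flags are modeled with one Gaussian
# Mixture Model per article-level class; a new article is assigned to the
# class under which its features are most likely.
import numpy as np
from sklearn.mixture import GaussianMixture

def article_features(sentence_bias):
    # sentence_bias: sequence of 0/1 flags, one per sentence of the article
    s = np.asarray(sentence_bias, dtype=float)
    rel = np.where(s == 1)[0] / max(len(s), 1)   # relative positions in [0, 1)
    return [
        s.mean(),                                # frequency of biased sentences
        rel.mean() if len(rel) else 0.0,         # average position of bias
        rel.std() if len(rel) else 0.0,          # spread of bias over the article
    ]

def fit_class_gmms(articles_by_class, n_components=2):
    # articles_by_class: {article_label: [sentence_bias, ...]} (hypothetical data)
    return {
        label: GaussianMixture(n_components=n_components, random_state=0).fit(
            [article_features(a) for a in articles]
        )
        for label, articles in articles_by_class.items()
    }

def predict(gmms, sentence_bias):
    feats = [article_features(sentence_bias)]
    # pick the class whose mixture assigns the highest log-likelihood
    return max(gmms, key=lambda label: gmms[label].score(feats))
```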
48. The Importance of Suppressing Domain Style in Authorship Analysis
- Author
-
Bischoff, Sebastian, Deckers, Niklas, Schliebs, Marcel, Thies, Ben, Hagen, Matthias, Stamatatos, Efstathios, Stein, Benno, and Potthast, Martin
- Subjects
Computer Science - Computation and Language - Abstract
The prerequisite of many approaches to authorship analysis is a representation of writing style. But despite decades of research, it still remains unclear to what extent commonly used and widely accepted representations like character trigram frequencies actually represent an author's writing style, in contrast to more domain-specific style components or even topic. We address this shortcoming for the first time in a novel experimental setup of fixed authors but swapped domains between training and testing. With this setup, we reveal that approaches using character trigram features are highly susceptible to favor domain information when applied without attention to domains, suffering drops of up to 55.4 percentage points in classification accuracy under domain swapping. We further propose a new remedy based on domain-adversarial learning and compare it to ones from the literature based on heuristic rules. Both can work well, reducing accuracy losses under domain swapping to 3.6% and 3.9%, respectively. (See the illustrative sketch after this record.)
- Published
- 2020
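The swapped-domain setup described above, where the author set stays fixed but training and test documents come from different domains, can be sketched as follows. This is a hedged illustration with character-trigram TF-IDF features and a linear classifier, not the authors' pipeline, and the data arguments are placeholders.

```python
# Minimal sketch of the swapped-domain evaluation: train on one domain,
# test on another, with identical author sets. Large accuracy drops under
# this split indicate that the features encode domain rather than style.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def evaluate_domain_swap(train_docs, train_authors, test_docs, test_authors):
    """train_docs come from one domain, test_docs from another;
    the author labels cover the same set of authors in both."""
    vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 3), min_df=2)
    X_train = vec.fit_transform(train_docs)
    X_test = vec.transform(test_docs)
    clf = LogisticRegression(max_iter=1000).fit(X_train, train_authors)
    return accuracy_score(test_authors, clf.predict(X_test))
```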
49. Conversational Search -- A Report from Dagstuhl Seminar 19461
- Author
-
Anand, Avishek, Cavedon, Lawrence, Hagen, Matthias, Joho, Hideo, Sanderson, Mark, and Stein, Benno
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language ,Computer Science - Human-Computer Interaction - Abstract
Dagstuhl Seminar 19461 "Conversational Search" was held on 10-15 November 2019. 44 researchers in Information Retrieval and Web Search, Natural Language Processing, Human-Computer Interaction, and Dialogue Systems were invited to share the latest development in the area of Conversational Search and discuss its research agenda and future directions. A 5-day program of the seminar consisted of six introductory and background sessions, three visionary talk sessions, one industry talk session, and seven working groups and reporting sessions. The seminar also had three social events during the program. This report provides the executive summary, overview of invited talks, and findings from the seven working groups which cover the definition, evaluation, modelling, explanation, scenarios, applications, and prototype of Conversational Search. The ideas and findings presented in this report should serve as one of the main sources for diverse research programs on Conversational Search., Comment: contains arXiv:2001.06910, arXiv:2001.02912
- Published
- 2020
50. Abstractive Snippet Generation
- Author
-
Chen, Wei-Fan, Syed, Shahbaz, Stein, Benno, Hagen, Matthias, and Potthast, Martin
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
An abstractive snippet is an originally created piece of text to summarize a web page on a search engine results page. Compared to the conventional extractive snippets, which are generated by extracting phrases and sentences verbatim from a web page, abstractive snippets circumvent copyright issues; even more interesting is the fact that they open the door for personalization. Abstractive snippets have been evaluated as equally powerful in terms of user acceptance and expressiveness, but the key question remains: Can abstractive snippets be automatically generated with sufficient quality? This paper introduces a new approach to abstractive snippet generation: We identify the first two large-scale sources for distant supervision, namely anchor contexts and web directories. By mining the entire ClueWeb09 and ClueWeb12 for anchor contexts and by utilizing the DMOZ Open Directory Project, we compile the Webis Abstractive Snippet Corpus 2020, comprising more than 3.5 million triples of the form ⟨query, snippet, document⟩ as training examples, where the snippet is either an anchor context or a web directory description in lieu of a genuine query-biased abstractive snippet of the web document. We propose a bidirectional abstractive snippet generation model and assess the quality of both our corpus and the generated abstractive snippets with standard measures, crowdsourcing, and in comparison to the state of the art. The evaluation shows that our novel data sources along with the proposed model allow for producing usable query-biased abstractive snippets while minimizing text reuse (see the illustrative sketch after this record)., Comment: Accepted by WWW 2020
- Published
- 2020
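As a hedged illustration of the anchor-context idea in the abstract above, not the Webis mining pipeline itself, the sketch below treats anchor text as a pseudo-query, the anchor's enclosing paragraph as a stand-in snippet, and the link target as the document; the use of BeautifulSoup and the word-count threshold are assumptions made for the example.

```python
# Illustrative sketch of distant supervision from anchor contexts:
# extract (query, snippet, document) triples from a single HTML page.
from bs4 import BeautifulSoup

def anchor_context_triples(html, min_context_words=10):
    soup = BeautifulSoup(html, "html.parser")
    triples = []
    for a in soup.find_all("a", href=True):
        query = a.get_text(strip=True)           # anchor text as pseudo-query
        paragraph = a.find_parent("p")           # enclosing paragraph as context
        if not query or paragraph is None:
            continue
        context = paragraph.get_text(" ", strip=True)
        if len(context.split()) >= min_context_words:
            triples.append((query, context, a["href"]))  # (query, snippet, document)
    return triples
```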