Surfer100: Generating Surveys From Web Resources, Wikipedia-style

Authors :
Li, Irene
Fabbri, Alexander
Kawamura, Rina
Liu, Yixin
Tang, Xiangru
Tae, Jaesung
Shen, Chang
Ma, Sally
Mizutani, Tomoe
Radev, Dragomir
Publication Year :
2021

Abstract

Fast-developing fields such as Artificial Intelligence (AI) often outpace the efforts of encyclopedic sources such as Wikipedia, which either do not completely cover recently-introduced topics or lack such content entirely. As a result, methods for automatically producing content are valuable tools to address this information overload. We show that recent advances in pretrained language modeling can be combined for a two-stage extractive and abstractive approach for Wikipedia lead paragraph generation. We extend this approach to generate longer Wikipedia-style summaries with sections and examine how such methods struggle in this application through detailed studies with 100 reference human-collected surveys. This is the first study on utilizing web resources for long Wikipedia-style summaries to the best of our knowledge.

Comment: LREC 2022, main conference

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2112.06377
Document Type :
Working Paper