
Generating Wikipedia by Summarizing Long Sequences

Authors :
Liu, Peter J.
Saleh, Mohammad
Pot, Etienne
Goodrich, Ben
Sepassi, Ryan
Kaiser, Lukasz
Shazeer, Noam
Publication Year :
2018

Abstract

We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
Comment: Published as a conference paper at ICLR 2018
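To make the abstract's decoder-only framing concrete, the sketch below shows a single causal self-attention layer over one long sequence formed by concatenating extracted source text with the target article, so every generated position can attend to all earlier positions. This is a minimal illustration under assumed toy shapes, not the paper's actual implementation; in particular, the paper's mechanisms for scaling attention to very long sequences are not reproduced here, and all names and dimensions are illustrative.

```python
# Minimal sketch of decoder-only causal self-attention (illustrative only;
# not the paper's exact architecture). Shapes and names are assumptions.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a sequence of embeddings.

    x             : (seq_len, d_model) input embeddings
    w_q, w_k, w_v : (d_model, d_head) projection matrices
    Returns       : (seq_len, d_head) attended representations.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])              # (seq_len, seq_len)
    # Causal mask: position i may only attend to positions <= i,
    # which is what lets a single decoder stack be used for generation.
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Illustrative usage: extracted source text + target article treated as one sequence.
rng = np.random.default_rng(0)
d_model, d_head, seq_len = 64, 32, 512                   # assumed toy sizes
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)                                         # (512, 32)
```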

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1801.10198
Document Type :
Working Paper