Jointly Learning Topics in Sentence Embedding for Document Summarization.

Authors :
Gao, Yang
Xu, Yue
Huang, Heyan
Liu, Qian
Wei, Linjing
Liu, Luyang
Source :
IEEE Transactions on Knowledge & Data Engineering; Apr2020, Vol. 32 Issue 4, p688-699, 12p
Publication Year :
2020

Abstract

Summarization systems for various applications, such as opinion mining, online news services, and answering questions, have attracted increasing attention in recent years. These tasks are complicated, and a classic representation using bag-of-words does not adequately meet the comprehensive needs of applications that rely on sentence extraction. In this paper, we focus on representing sentences as continuous vectors as a basis for measuring relevance between user needs and candidate sentences in source documents. Embedding models based on distributed vector representations are often used in the summarization community because, through cosine similarity, they simplify sentence relevance when comparing two sentences or a sentence/query and a document. However, the vector-based embedding models do not typically account for the salience of a sentence, and this is a very necessary part of document summarization. To incorporate sentence salience, we developed a model, called CCTSenEmb, that learns latent discriminative Gaussian topics in the embedding space and extended the new framework by seamlessly incorporating both topic and sentence embedding into one summarization system. To facilitate the semantic coherence between sentences in the framework of prediction-based tasks for sentence embedding, the CCTSenEmb further considers the associations between neighboring sentences. As a result, this novel sentence embedding framework combines sentence representations, word-based content, and topic assignments to predict the representation of the next sentence. A series of experiments with the DUC datasets validate CCTSenEmb's efficacy in document summarization in a query-focused extraction-based setting and an unsupervised ILP-based setting. [ABSTRACT FROM AUTHOR]
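As an illustration of the cosine-similarity relevance measure the abstract mentions, the following is a minimal sketch that ranks candidate sentence vectors against a query vector. It is not the CCTSenEmb model: the function names and the three-dimensional toy vectors are assumptions made up for this example, and the sketch deliberately ignores sentence salience, which is the gap the abstract says the paper addresses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_relevance(query_vec, sentence_vecs):
    """Return (index, score) pairs sorted from most to least similar to the query."""
    scores = [(i, cosine_similarity(query_vec, s)) for i, s in enumerate(sentence_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings (not taken from the paper or the DUC datasets).
query = [1.0, 0.0, 1.0]
candidates = [
    [1.0, 0.0, 1.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 1.0, 0.0],  # partially overlapping with the query
]
ranking = rank_by_relevance(query, candidates)
```

In this sketch, `ranking` lists the candidate identical in direction to the query first and the orthogonal candidate last. A real system would obtain the vectors from a trained sentence embedding model rather than hand-written lists.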

Details

Language :
English
ISSN :
1041-4347
Volume :
32
Issue :
4
Database :
Complementary Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
143313729
Full Text :
https://doi.org/10.1109/TKDE.2019.2892430