Query-oriented unsupervised multi-document summarization via deep learning model
- Source :
- Expert Systems with Applications. 42:8146-8155
- Publication Year :
- 2015
- Publisher :
- Elsevier BV, 2015.
Abstract
- First attempt to apply deep learning to query-oriented multi-document summarization. A novel algorithm effectively pushes out important concepts layer by layer. Excellent extraction ability is confirmed under an unsupervised learning framework. Capturing the compositional process from words to documents is a key challenge in natural language processing and information retrieval. Extractive query-oriented multi-document summarization generates a summary by extracting a suitable set of sentences from multiple documents based on a given query. This paper proposes a novel document summarization framework based on a deep learning model, which has shown outstanding extraction ability in many real-world applications. The framework consists of three parts: concept extraction, summary generation, and reconstruction validation. A new query-oriented extraction technique is proposed to extract information distributed across multiple documents. The whole deep architecture is then fine-tuned by minimizing the information loss in reconstruction validation. Based on the concepts extracted from the deep architecture layer by layer, dynamic programming is used to find the most informative set of sentences for the summary. Experiments on three benchmark datasets (DUC 2005, 2006, and 2007) confirm the effectiveness of the proposed framework and algorithms. Experimental results show that the proposed method outperforms state-of-the-art extractive summarization approaches. We also provide a statistical analysis of query words based on Amazon's Mechanical Turk (MTurk) crowdsourcing platform. Underlying relationships exist between topic words and the content, and these can contribute to the summarization task.
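The abstract states that dynamic programming is used to pick the most informative set of sentences, but this record does not give the formulation. As a rough illustration only, the sketch below treats the selection step as a 0/1 knapsack problem: each sentence has a length and an importance score, and the goal is to maximize total score within a summary length budget. The function name `select_sentences`, the per-sentence scores, and the budget are all hypothetical; in the paper the scores would presumably come from the concepts extracted by the deep architecture, which is not modeled here.

```python
from typing import List, Tuple


def select_sentences(
    lengths: List[int], scores: List[float], budget: int
) -> Tuple[float, List[int]]:
    """Pick sentences maximizing total score within a length budget.

    Illustrative 0/1 knapsack only; not the paper's actual objective.
    lengths[i] is the (positive) word count of sentence i,
    scores[i] is an assumed importance score for sentence i.
    """
    n = len(lengths)
    # best[i][b] = highest score using the first i sentences and at most b words
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], scores[i - 1]
        for b in range(budget + 1):
            # Option 1: skip sentence i-1
            best[i][b] = best[i - 1][b]
            # Option 2: take sentence i-1 if it fits and improves the score
            if length <= b:
                candidate = best[i - 1][b - length] + score
                if candidate > best[i][b]:
                    best[i][b] = candidate

    # Walk the table backwards to recover which sentences were taken
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    # Toy input: three sentences with made-up lengths and scores
    total, chosen = select_sentences([10, 20, 30], [6.0, 10.0, 12.0], 50)
    print(total, chosen)
```

The table has one row per sentence and one column per budget value, so the cost grows with the number of sentences times the budget. A real system would also need to handle redundancy between selected sentences, which this sketch ignores.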
- Subjects :
- Information retrieval
Computer science
business.industry
Deep learning
General Engineering
Crowdsourcing
Automatic summarization
Computer Science Applications
Set (abstract data type)
Artificial Intelligence
Multi-document summarization
Benchmark (computing)
Key (cryptography)
Unsupervised learning
Artificial intelligence
business
Details
- ISSN :
- 09574174
- Volume :
- 42
- Database :
- OpenAIRE
- Journal :
- Expert Systems with Applications
- Accession number :
- edsair.doi...........6b12c2acd35b5f9f8bc9e05ae20437eb
- Full Text :
- https://doi.org/10.1016/j.eswa.2015.05.034