Language features in extractive summarization: Humans Vs. Machines.
- Source :
- Knowledge-Based Systems. Sep 2019, Vol. 180, p1-11. 11p.
- Publication Year :
- 2019
Abstract
- This paper presents a comparative statistical analysis of the language features most commonly used for Automatic Text Summarization (ATS), namely: Parts of Speech (PoS) (unigrams and bigrams), sentiments (by token and sentence), and Rhetorical Structure Theory (RST) relations. The analyses were carried out on both human-made and machine-made summaries, in order to determine whether current ATS systems capture the same kind of information as humans do. Our results show that there are some marked differences between machine-made and human-made summaries, which at times may seem counterintuitive. For instance, named entities were usually frequent in machine-made summaries, but not in human-made ones. Similarly, words perceived to hold a "neutral" sentiment were systematically favored by machines, but not always by humans.
• This paper investigates the pertinence of language features commonly utilized in ATS.
• Statistical comparisons were conducted between human-made and machine-made summaries.
• Some features were interestingly used moderately by humans, but not by machines.
[ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 09507051
- Volume :
- 180
- Database :
- Academic Search Index
- Journal :
- Knowledge-Based Systems
- Publication Type :
- Academic Journal
- Accession number :
- 136934192
- Full Text :
- https://doi.org/10.1016/j.knosys.2019.05.014