News Text Summary Generation Based on an Improved T5 PEGASUS Model.

Authors :
张琪
范永胜
Source :
Electronic Science & Technology. 2023, Vol. 36 Issue 12, p72-78. 7p.
Publication Year :
2023

Abstract

News text summarization aims to reduce the wasted time and reading fatigue that arise when users cannot quickly grasp the key points of a news article. At present, the best text summarization model for Chinese is the T5 PEGASUS model, but it has received little research attention. This study improves the Chinese word segmentation of the T5 PEGASUS model by using the Pkuseg word segmentation method, which is better suited to the news domain, and verifies its effectiveness on three public datasets with different news lengths: NLPCC2017, LCSTS and SogouCS. The results show that the Pkuseg method is better suited to the T5 PEGASUS model. The ROUGE score of the summaries generated by the T5 PEGASUS model is positively correlated with the length of the news text, while the training-set loss value and the rate at which that loss declines are negatively correlated with the length of the news text. Even with a small amount of training data, the model achieves a high ROUGE score, which indicates strong few-shot learning ability. [ABSTRACT FROM AUTHOR]
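
The abstract reports ROUGE scores computed over segmented text. As a rough illustration only, the following Python sketch computes ROUGE-1 precision, recall and F1 from two token lists. It is not the authors' evaluation code: the function name `rouge_1` and the sample tokens are hypothetical, and the sketch assumes the tokens have already been produced by a word segmenter (for Chinese news text, a tool such as Pkuseg would fill that role).

```python
from collections import Counter


def rouge_1(candidate_tokens, reference_tokens):
    """Return (precision, recall, f1) based on clipped unigram overlap.

    Hypothetical helper for illustration; inputs are lists of tokens
    that are assumed to be already segmented.
    """
    cand_counts = Counter(candidate_tokens)
    ref_counts = Counter(reference_tokens)
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand_counts & ref_counts).values())
    precision = overlap / len(candidate_tokens) if candidate_tokens else 0.0
    recall = overlap / len(reference_tokens) if reference_tokens else 0.0
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example tokens, not taken from any of the datasets named above.
reference = ["the", "model", "summarizes", "news"]
candidate = ["the", "model", "reads", "news", "quickly"]
precision, recall, f1 = rouge_1(candidate, reference)
print(precision, recall, f1)
```

Because the score depends directly on token boundaries, a different segmentation of the same sentence can change the overlap counts, which is one reason the choice of segmenter matters when comparing ROUGE values.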

Details

Language :
Chinese
ISSN :
10077820
Volume :
36
Issue :
12
Database :
Academic Search Index
Journal :
Electronic Science & Technology
Publication Type :
Academic Journal
Accession number :
174086789
Full Text :
https://doi.org/10.16180/j.cnki.issn1007-7820.2023.12.010