How to Evaluate Single-Round Dialogues Like Humans: An Information-Oriented Metric
- Source :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing. 28:2211-2223
- Publication Year :
- 2020
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2020.
Abstract
- Developing a dialogue response generation system is one of the important topics in natural language processing, but many obstacles must still be overcome before automatically generated dialogues of human-like quality become possible. A good evaluation method will help narrow the gap between machines and humans in dialogue generation. Unfortunately, existing automatic evaluation methods are biased and correlate very poorly with human judgments of response quality. Such methods cannot assess whether a dialogue response generation system produces high-quality, knowledge-related, and informative dialogues. In response to this challenge, we design an information-oriented framework to simulate human subjective evaluation. Using this framework, we implement a learning-based metric to evaluate the quality of a dialogue. An experimental validation demonstrates the effectiveness of our proposed metric in dialogue selection and model evaluation on a Twitter dataset (in English) and a Weibo dataset (in Chinese). In addition, the metric is more relevant to human subjective judgment than existing methods of dialogue evaluation.
- Subjects :
- Response generation
Acoustics and Ultrasonics
Computer science
Experimental validation
Machine learning
Chatbot
Computational Mathematics
Information extraction
Metric (mathematics)
Evaluation methods
Computer Science (miscellaneous)
Selection (linguistics)
Quality (business)
Artificial intelligence
Electrical and Electronic Engineering
Details
- ISSN :
- 2329-9304 and 2329-9290
- Volume :
- 28
- Database :
- OpenAIRE
- Journal :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- Accession number :
- edsair.doi...........751f6e0dbebdd3126ff64c23a8919027
- Full Text :
- https://doi.org/10.1109/taslp.2020.3003864