1. Do People and Neural Nets Pay Attention to the Same Words
- Author
- Vladislav Blinov, W. Bruce Croft, Yukun Zheng, Falk Scholer, Mark Sanderson, and Valeria Bolotova
- Subjects
- Correctness, Completeness, Factoid, Gaze, Annotation, Eye tracking, Quality, Natural language processing, Artificial intelligence, Computer science
- Abstract
- We investigated how users evaluate passage-length answers for non-factoid questions. We conducted a study in which answers were presented to users, sometimes with automatic word highlighting. Users were tasked with evaluating answer quality, correctness, completeness, and conciseness. Words in the answer were also annotated, both explicitly through user mark-up and implicitly through user gaze data obtained from eye tracking. Our results show that the correctness of an answer strongly depends on its completeness, while conciseness is less important. Analysis of the annotated words showed that correct and incorrect answers were assessed differently. Automatic highlighting helped users evaluate answers more quickly while maintaining accuracy, particularly when the highlighting was similar to the users' own annotations. We fine-tuned a BERT model on a non-factoid QA task to examine whether the model attends to words similar to those annotated. Similarity was found; consequently, we propose a method that exploits the BERT attention map to generate highlighting suggestions that simulate eye gaze during user evaluation. (An illustrative sketch of this attention-extraction step follows this entry.)
- Published
- 2020
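
The abstract describes using a fine-tuned BERT model's attention map to suggest words to highlight in an answer passage. The sketch below is a minimal illustration of that general idea, not the paper's exact method: it assumes a HuggingFace-style BERT sequence-classification model, uses the last layer's [CLS] attention averaged over heads, and the model name, layer choice, and top-k cutoff are all illustrative assumptions.

```python
# Hypothetical sketch: rank answer tokens by the attention they receive
# from [CLS] in a BERT model, as candidate words to highlight.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # placeholder; the paper fine-tunes BERT on non-factoid QA
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, output_attentions=True)
model.eval()

def suggest_highlights(question: str, answer: str, top_k: int = 5):
    """Return the answer tokens receiving the most [CLS] attention in the last layer."""
    enc = tokenizer(question, answer, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**enc)
    # out.attentions: one tensor per layer, each (batch, heads, seq_len, seq_len).
    # Average the last layer over heads and take attention from the [CLS] position.
    cls_attention = out.attentions[-1].mean(dim=1)[0, 0]  # shape: (seq_len,)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    # Keep only tokens belonging to the answer segment and skip special tokens.
    answer_scores = [
        (tok, score.item())
        for tok, score, seg in zip(tokens, cls_attention, enc["token_type_ids"][0])
        if seg == 1 and tok not in ("[SEP]", "[PAD]")
    ]
    return sorted(answer_scores, key=lambda x: x[1], reverse=True)[:top_k]

# Example: candidate highlight words for a passage-length answer.
print(suggest_highlights(
    "Why is the sky blue?",
    "Sunlight scatters off air molecules, and shorter blue wavelengths "
    "scatter more strongly than longer red ones."))
```

In practice such attention scores would be mapped back from word pieces to whole words before being shown as highlights; the paper's own suggestion method may differ in which layers, heads, or aggregation it uses.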