1. Early prediction of writing quality using keystroke logging
- Author
Luuk Van Waes, Rianne Conijn, Christine L. Cook, and Menno van Zaanen (Human Technology Interaction; EAISI Foundational)
- Subjects
Academic writing, Computer science, Keystroke logging, Early prediction, Education, Task (project management), Educational sciences, Computer. Automation, Languages & linguistics, Social sciences, Writing process, Educational technology, Contrast (statistics), Linguistics, Regression analysis, Humanities and the arts, Automatic summarization, Writing quality, Writing processes, Computational Theory and Mathematics, Literature, Languages and literature, Artificial intelligence, Natural language processing
- Abstract
Feedback is important to improve writing quality; however, to provide timely and personalized feedback is a time-intensive task. Currently, most literature focuses on providing (human or machine) support on product characteristics, especially after a draft is submitted. However, this does not assist students who struggle during the writing process. Therefore, in this study, we investigate the use of keystroke analysis to predict writing quality throughout the writing process. Keystroke data were analyzed from 126 English as a second language learners performing a timed academic summarization task. Writing quality was measured using participants’ final grade. Based on previous literature, 54 keystroke features were extracted. Correlational analyses were conducted to identify the relationship between keystroke features and writing quality. Next, machine learning models (regression and classification) were used to predict final grade and classify students who might need support at several points during the writing process. The results show that, in contrast to previous work, the relationship between writing quality and keystroke data was rather limited. None of the regression models outperformed the baseline, and the classification models were only slightly better than the majority class baseline (highest AUC = 0.57). In addition, the relationship between keystroke features and writing quality changed throughout the course of the writing process. To conclude, the relationship between keystroke data and writing quality might be less clear than previously posited.
- Published
2022