Predicting CEFR levels in learners of English: The use of microsystem criterial features in a machine learning approach

Authors :
Annanda Sousa
Thomas Gaillat
Manel Zarrouk
Nicolas Ballier
Bernardo Stearns
Manon Bouyé
Andrew J Simpkin
Linguistique, Ingénierie, Didactique des Langues (LIDILE)
Université de Rennes 2 (UR2)
Université de Rennes (UNIV-RENNES)
National University of Ireland [Galway] (NUI Galway)
Centre de Linguistique Inter-langues, de Lexicologie, de Linguistique Anglaise et de Corpus (CLILLAC-ARP (EA_3967))
Université Paris Diderot - Paris 7 (UPD7)
Université de Paris (UP)
Insight Centre for Data Analytics [Galway] (INSIGHT)
Source :
ReCALL, Cambridge University Press (CUP), 2021, pp. 1-17. ⟨10.1017/S095834402100029X⟩
Publication Year :
2021
Publisher :
Cambridge University Press (CUP), 2021.

Abstract

This paper focuses on automatically assessing language proficiency levels according to linguistic complexity in learner English. We implement a supervised learning approach as part of an automatic essay scoring system. The objective is to uncover Common European Framework of Reference for Languages (CEFR) criterial features in writings by learners of English as a foreign language. Our method relies on the concept of microsystems with features related to learner-specific linguistic systems in which several forms operate paradigmatically. Results on internal data show that different microsystems help classify writings from A1 to C2 levels (82% balanced accuracy). Overall results on external data show that a combination of lexical, syntactic, cohesive and accuracy features yields the most efficient classification across several corpora (59.2% balanced accuracy).
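
The abstract reports its results as balanced accuracy, which is usually defined as the mean of per-class recall, so that a frequent CEFR level cannot dominate the score. The sketch below illustrates that definition only. The function name and the toy labels are invented for illustration; it is not the authors' pipeline, features, or data.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    total = defaultdict(int)    # number of essays per true level
    correct = defaultdict(int)  # correctly predicted essays per true level
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[level] / total[level] for level in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not taken from the paper): A1 is over-represented.
y_true = ["A1", "A1", "A1", "A1", "B1", "B1", "C2"]
y_pred = ["A1", "A1", "A1", "A2", "B1", "A2", "C2"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

On this toy set the per-level recalls are 3/4 for A1, 1/2 for B1 and 1/1 for C2, so each level contributes equally to the score regardless of how many essays it contains.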

Details

ISSN :
1474-0109 and 0958-3440
Volume :
34
Database :
OpenAIRE
Journal :
ReCALL
Accession number :
edsair.doi.dedup.....52e72c1a8290c49b765b26a393d38cfc