1. Neural Language Model-based Readability Assessment of Computer Science Introductory Texts for English-as-a-Second Language Learners
- Author
- Ehara, Yo
- Abstract
English is the dominant language in computer science. In addition to English-based academic papers, English is frequently the only language provided in introduction sections and manuals of commands and software libraries, which are essential aspects of computer programming. Hence, English-as-a-second-language (ESL) learners may have difficulty studying computer science because they must learn this field while also learning English. Despite this problem, few studies have assessed the difficulty level of computer science texts for ESL learners. Ideally, the difficulty levels of texts are assessed by having groups of ESL learners read them. However, owing to the excessive time and financial costs involved, such practices can be impractical. Hence, using two highly accurate automatic readability assessors based on natural language processing (NLP) techniques, we assessed the readability of various computer-science-related texts for ESL learners. The first assessor is based on state-of-the-art deep transfer learning, and the second is based on classical machine learning and applied linguistics. For training the assessors, we used a standard corpus employed in NLP, which was annotated by professional English teachers to evaluate the readability of the texts for ESL learners. To conduct the experiments, we built a collection of computer science texts ranging from academic papers to software manuals (READMEs) crawled from a source-code hosting website, namely GitHub. The experimental results showed that intermediate ESL learners were able to read most of the computer-science-related texts.
- Published
- 2022