1. The Convergent Validity of Mobile Learning Apps' Usability Evaluation by Popular Generative Artificial Intelligence (AI) Robots
- Author
Victor K. Y. Chan
- Abstract
This article explores the convergent validity of (and thus the consistency between) a few popular generative artificial intelligence (AI) robots in evaluating the usability of popular mobile learning apps. The three robots adopted in the study were Microsoft Copilot, Google PaLM, and Meta Llama. Each was individually instructed to assign rating scores to 17 currently most popular mobile learning apps on eight major usability dimensions, namely, (1) content/course quality, (2) pedagogical design, (3) learner support, (4) technology infrastructure, (5) social interaction, (6) learner engagement, (7) instructor support, and (8) cost-effectiveness. For each of the three robots, the minimum, the maximum, the range, and the standard deviation of the rating scores for each of the eight dimensions were computed across all the mobile learning apps. The rating score difference for each of the eight dimensions between each pair of the three robots was calculated for each app. The mean of the absolute value, the minimum, the maximum, the range, and the standard deviation of the differences for each dimension between each pair of robots were then calculated across all the apps. A paired-sample t-test was applied to each dimension for the rating score difference between each robot pair over all the apps. Finally, Cronbach's coefficient alpha of the rating scores was computed for each of the eight dimensions among all three robots across all the apps. The computational results were intended to reveal whether the three robots discriminated between the apps in evaluating each dimension, whether each robot, relative to any other robot, erratically and/or systematically overrated or underrated any dimension over the apps, and whether there was high convergent validity of (and thus consistency between) the three robots in evaluating each dimension across the apps.
Among other auxiliary results, it was revealed that the convergent validity of (and the consistency between) the three robots was marginally acceptable only in evaluating mobile learning apps' dimension of (1) content/course quality but not at all in the dimensions (2) pedagogical design, (3) learner support, (4) technology infrastructure, (5) social interaction, (6) learner engagement, (7) instructor support, and (8) cost-effectiveness. [For the full proceedings, see ED659933.]
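The statistics named in the abstract can be illustrated with a minimal sketch for a single usability dimension. This is not the author's code: the function names, the rating scale, and the scores are hypothetical and serve only to show the arithmetic, with each robot treated as one "item" in Cronbach's alpha and each app as one case. The paired t-test is given as a statistic and degrees of freedom only; a p-value would require a t-distribution lookup that is omitted here.

```python
from math import sqrt
from statistics import mean, stdev, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for one dimension.

    ratings: one list of scores per rater (robot), all the same length,
    with position i in every list referring to the same app.
    """
    k = len(ratings)
    rater_variances = [variance(r) for r in ratings]     # sample variance per rater
    totals = [sum(scores) for scores in zip(*ratings)]   # summed score per app
    return k / (k - 1) * (1 - sum(rater_variances) / variance(totals))


def difference_summary(a, b):
    """Descriptive statistics of the per-app rating differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    return {
        "mean_abs": mean(abs(d) for d in diffs),
        "min": min(diffs),
        "max": max(diffs),
        "range": max(diffs) - min(diffs),
        "sd": stdev(diffs),
    }


def paired_t_statistic(a, b):
    """Paired-sample t statistic and degrees of freedom for a versus b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


if __name__ == "__main__":
    # Hypothetical scores for five apps on one dimension (not data from the study).
    robot_a = [7, 8, 6, 9, 7]
    robot_b = [6, 8, 7, 9, 5]
    robot_c = [8, 7, 6, 9, 6]
    print(cronbach_alpha([robot_a, robot_b, robot_c]))
    print(difference_summary(robot_a, robot_b))
    print(paired_t_statistic(robot_a, robot_b))
```

Read this way, a large mean absolute difference with a non-significant t statistic would suggest erratic disagreement between two robots, a significant t statistic would suggest systematic over- or underrating, and a low alpha would indicate weak consistency among all three.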
- Published
- 2024