1. Exploring pathways to comprehension performance in multilanguage smart voice systems: insights from Lasso regression, SEM, PLS-SEM, CNN, and BiLSTM.
- Author
- Gao, Entong; Guo, Jialu; Pang, Xipeng; Bo, Danya; and Chen, Zhe
- Subjects
- INTEGRAL domains, LINEAR operators, STATISTICAL learning, SMART speakers, LANGUAGE ability
- Abstract
- Smart voice systems, such as voice assistants and smart speakers, are integral to domains such as smart homes, customer service, healthcare, and smart learning. The effectiveness of these systems relies on user comprehension performance, which is crucial for enhancing user experience. In this study, the primary factors influencing comprehension performance in multilanguage smart voice systems are examined, and the efficacy of various analytical methods, including LASSO regression, SEM, PLS-SEM, CNN, and BiLSTM, is assessed by identifying and improving these factors. Using a diverse dataset from human–computer interaction experiments made publicly available on GitHub, these five methods are applied to discern the impact of environmental and user-specific factors on comprehension. The key findings indicate the following: 1) Noise types and noise sound levels markedly affect comprehension. Noise sound level exhibited an inverted U-shape curve (parameter: 0.088) due to the low and high levels of noise. Certain rhythmic noises, such as those from clocks (parameter: 0.033), enhance comprehension by fostering a conducive auditory environment. 2) Analytical method comparisons reveal that while LASSO regression (MSE = 0.026), SEM, and PLS-SEM effectively map the linear relationships and pathways affecting comprehension, deep learning approaches such as CNN and BiLSTM (MSE = 0.019) excel at handling complex, multidimensional data, offering superior predictive performance. 3) In a non-native language environment, the evaluation of user comprehension models is notably different from that in native language settings (native R² = 0.545; non-native R² = 0.347). Specifically, in non-native language environments, the variables and mechanisms influencing user comprehension models are clearer, more controllable, and more susceptible to proficiency levels (parameter: 0.164). This comprehensive study presents a novel comparison of traditional statistical and machine learning methods in analyzing smart voice system interaction across languages. These findings emphasize the significance of tailoring smart voice systems to user diversity in language proficiency, age, and educational background and suggest optimizing these systems under varied environmental conditions to improve comprehension and overall effectiveness. The insights from this study are critical for policymakers and designers aiming to refine the adaptability and user-centric nature of smart voice systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024