Detecting Aberrant Responses in Automated L2 Spoken English Assessment
This chapter is part of: Chapelle, C. A., Beckett, G. H., & Ranalli, J. (Eds.). (2024). Exploring artificial intelligence in applied linguistics. Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.
Description
Automated marking systems are increasingly used in computer-based semi-direct speaking assessment to meet the rising demand for marking efficiency. However, aberrant responses, such as incomprehensible speech or nonsensical answers, pose a threat to the accuracy of these systems. Using candidate responses from the Linguaskill Business speaking assessment, this study trained and evaluated two classifiers, namely a feature-based multi-layer perceptron classifier and a BERT-based classifier, for detecting aberrations in the L2 English spoken data. Additionally, the study examined the effect of this detection on the accuracy of the Linguaskill automated marking system. The study found that, although aberrant responses were scarce in the spoken data, they occurred more frequently among low-proficiency candidates. Both classifiers were effective in detecting aberrant responses, with the feature-based classifier performing slightly better. Adding the feature-based classifier significantly improved the performance of the Linguaskill automated marking system.
Details
Published: July 31, 2024
Published By: Iowa State University Digital Press
Pages: 22
DOI: 10.31274/isudp.2024.154.07
License Information: ©2024 The authors. Published under a CC BY license.
Citation: Gao, S., Gales, M., & Xu, J. (2024). Detecting aberrant responses in automated L2 spoken English assessment. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 96–117). Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.07