ChatGPT for Writing Evaluation: Examining the Accuracy and Reliability of AI-Generated Scores Compared to Human Raters
This chapter is part of: Chapelle, C. A., Beckett, G. H., & Ranalli, J. (Eds.). (2024). Exploring artificial intelligence in applied linguistics. Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.
Description
ChatGPT has proven beneficial in a variety of educational contexts, yet its effectiveness in scoring integrated second-language writing tasks remains uncertain. This study therefore explores the accuracy and reliability of ChatGPT-generated scores relative to human ratings under two prompting conditions (with or without the writing prompt and source texts) and, using a mixed-methods approach, examines the reasons behind rating discrepancies. ChatGPT rated 74 argumentative essays from the Iowa State University English Placement Test Corpus of Learner Writing under each prompting condition; its ratings were then compared with those of human raters. Measured against human raters, ChatGPT's reliability was moderate to low in both prompting conditions. In addition, a qualitative analysis of ChatGPT's scoring rationales suggested that, unlike human raters, ChatGPT was limited in detecting content-related issues and in integrating source text information. The findings suggest that a more rigorous process may be required to train ChatGPT to rate similarly to human raters.
Details
Published: July 31, 2024
Published By: Iowa State University Digital Press
Pages: 23
DOI: 10.31274/isudp.2024.154.06
License Information: ©2024 The authors. Published under a CC BY license.
Citation: Kim, H., Baghestani, Sh., Yin, Sh., Karatay, Y., Kurt, S., Beck, J., & Karatay, L. (2024). ChatGPT for writing evaluation: Examining the accuracy and reliability of AI-generated scores compared to human raters. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 73-95). Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.06.