Optimizing AI for Assessing L2 Writing Accuracy: An Exploration of Temperatures and Prompts
This chapter is part of: Chapelle, C. A., Beckett, G. H., & Ranalli, J. (Eds.). (2024). Exploring artificial intelligence in applied linguistics. Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.
Description |
---|
This study investigates the impact of temperature and prompt settings on ChatGPT-4 in assessing second language (L2) writing accuracy. Building on Pfau et al. (2023), we used a corpus of 100 essays by L2 writers of English and examined how three temperature settings (0, 0.7, 1) and two prompt types (defined, undefined) influenced ChatGPT-4’s performance in error detection compared to human coding. Results indicated that ChatGPT-4, while generally underestimating error counts compared to human coders, showed a strong positive correlation with human coding across various settings. Notably, prompts with a detailed definition of errors yielded higher correlation coefficients (ρ = 0.826 to 0.859) than those without (ρ = 0.692 to 0.702), suggesting that more detailed prompts enhance ChatGPT-4’s performance. Descriptive statistics showed that with a less-detailed prompt, the error detection ability of ChatGPT-4 was nearly identical across temperature settings, yet with a more detailed prompt, ChatGPT-4’s performance was slightly better at higher temperatures. We discuss the importance of temperature in relation to prompt specificity for reliable L2 writing accuracy assessment and provide suggestions for optimizing AI tools such as ChatGPT-4 for assessing L2 writing accuracy. |
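
The correlation coefficients reported above are written as ρ, which suggests Spearman's rank correlation between per-essay error counts from ChatGPT-4 and from the human coders; the abstract does not name the statistic, so that reading is an assumption. The sketch below shows how such a coefficient could be computed. The function names and the error counts are hypothetical and invented for illustration; they are not the study's code or data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-essay error counts (NOT the study's data).
human_counts = [12, 7, 20, 3, 15, 9]
model_counts = [9, 6, 14, 2, 13, 5]

rho = spearman_rho(human_counts, model_counts)
print(round(rho, 3))  # 0.943
```

In this toy data the model's count is lower than the human count for every essay, yet the rank order is nearly the same, so ρ stays high. That is the pattern the abstract describes: underestimated error counts alongside a strong positive correlation.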
Details
Published | Published By | Pages | DOI |
---|---|---|---|
July 31, 2024 | Iowa State University Digital Press | 24 | 10.31274/isudp.2024.154.10 |

License Information: ©2024 The authors. Published under a CC BY license.

Citation: Xu, Y., Polio, Ch., & Pfau, A. (2024). Optimizing AI for assessing L2 writing accuracy: An exploration of temperatures and prompts. In C. A. Chapelle, G. H. Beckett, & J. Ranalli (Eds.), Exploring artificial intelligence in applied linguistics (pp. 151–174). Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.10.