Human vs. AI: The Use of ChatGPT in Writing Assessment

DOI: 10.4018/979-8-3693-0353-5.ch009

Abstract

The current study investigates whether ChatGPT 3.5 can be used as an aid to reduce teachers' workload in assessing writing. To this aim, a mixed-methods research design was employed. Twenty randomly selected descriptive essays written by freshman student teachers of English Language Teaching were scored by an experienced human rater and by ChatGPT 3.5. A ‘descriptive essay rubric’ adapted by the researchers was used to assess the student teachers' essays. The quantitative aspect of the study involved frequency and percentage analysis, while the qualitative dimension centered on the written feedback provided by both ChatGPT and the human rater. The findings showed disagreement between ChatGPT 3.5 and the human rater, along with some problems in the written feedback ChatGPT provides; at the same time, ChatGPT is notably rapid in delivering feedback. It is therefore recommended that ChatGPT 3.5 be employed as a tool under the supervision of teachers.

Introduction

The literature consistently emphasizes the importance of students practicing writing extensively to enhance their ability to express themselves effectively. However, teachers often face time constraints that hinder their ability to provide timely feedback and evaluate student assignments. This challenge has driven the expansion of Automated Writing Evaluation (AWE), also referred to as Automated Essay Evaluation or Automated Essay Scoring (Huawei & Aryadoust, 2023), and of its base of supporters. AWE leverages Artificial Intelligence (AI) technology to rapidly score written work (Cushing Weigle, 2010; Warschauer & Grimes, 2008). “The sheer number of hours commenting on student papers is reduced dramatically when instructors can rely on automated electronic feedback systems” (Ware & Warschauer, 2006, p. 108). While studies have shown that AWE can serve as an aid for teachers in scoring writing (Wilson & Roscoe, 2020), a number of concerns remain, especially among writing educators (Cushing Weigle, 2010).

AWE systems make use of AI (Steiss et al., 2023), specifically “natural language processing (NLP) and machine learning techniques” (Uto, 2021), to examine written texts and promptly produce ratings based on writing quality. Furthermore, they provide written feedback on both overall and specific aspects of writing (Cushing Weigle, 2010). While the current functionalities of AWE systems are impressive, it is worth tracing their evolution over the decades.

It was in the 1960s that Page (1966) raised the possibility of scoring essays automatically with a project called ‘Project Essay Grade’. Three decades later, developments in computing and the Internet created possibilities for marketing such systems globally (Warschauer & Grimes, 2008). Educational Testing Service (ETS) developed a system called e-rater to score the essays written in the TOEFL iBT (Cushing Weigle, 2010). Around the same time, IntelliMetric was developed by Vantage Learning. Another engine for scoring high-stakes writing tasks was Intelligent Essay Assessor, developed by a group of academics using a technique called latent semantic analysis; Pearson, a publishing company, later acquired this technology for its tests (Warschauer & Grimes, 2008). Apart from these, there are AWE tools designed for classroom use, including Writer’s Workbench, MY Access!, WriteToLearn, Criterion, and RWT (Saricaoglu, 2015).
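To make the underlying idea concrete, the following is a minimal, hypothetical sketch in Python (using the scikit-learn library) of how a latent-semantic-analysis scorer, in the spirit of the technique named above, might rate a new essay by comparing it with human-scored reference essays. The essays, scores, and parameters are invented for illustration and do not reflect any commercial engine.

# A minimal, illustrative sketch (not any vendor's actual system) of
# latent-semantic-analysis scoring: reduce TF-IDF vectors of pre-scored
# reference essays to a low-dimensional semantic space, then score a new
# essay by similarity-weighted averaging. All data below are invented.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

reference_essays = [
    "The beach at dawn is quiet, the sand cool and smooth underfoot.",
    "My room has a desk, a bed, and a window that faces the street.",
    "Autumn paints the park in red and gold; leaves drift onto the paths.",
    "The market is loud and crowded, full of colour, smells, and noise.",
]
reference_scores = np.array([5.0, 3.0, 5.0, 4.0])  # human-assigned ratings

new_essay = "At sunrise the shore is calm and the cold sand feels smooth."

# Build a shared semantic space from the reference essays plus the new one.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(reference_essays + [new_essay])
lsa = TruncatedSVD(n_components=3, random_state=0)
vectors = lsa.fit_transform(tfidf)

# Similarity of the new essay to each pre-scored reference essay.
sims = cosine_similarity(vectors[-1:], vectors[:-1])[0]
weights = np.clip(sims, 0, None)  # ignore negative similarities
predicted = float(np.dot(weights, reference_scores) / max(weights.sum(), 1e-9))
print(f"Predicted score: {predicted:.1f}")

Real AWE engines combine many more features (syntax, discourse, usage errors) and are trained on large corpora of human-scored essays; the sketch shows only the semantic-similarity component in its simplest form.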
