Enhancing Grading Quality with GraderGPT in Large Courses

University of Michigan School of Kinesiology

Problem

In large courses with multiple graders, maintaining consistent grading quality and limiting scoring variance can be challenging. Professor Peter Bodary aimed to address these issues by developing a system that assists human graders in ensuring more uniform grading standards and improving the clarity of grading criteria.

Audience

The AI tool, GraderGPT, was targeted at staff and student graders responsible for evaluating writing assignments in large courses.

Outcome/Impact

In collaboration with ITS, Professor Peter Bodary used student assignment submissions, assignment rubrics, grades, and graders’ notes from Canvas to build GraderGPT. Built on the U-M GPT Toolkit, the system assists in grading writing assignments by dynamically assembling prompts from the available grading criteria and submission information, then grading and scoring every submission in an assignment automatically.
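The dynamic prompt-building step described above can be sketched roughly as follows. This is an illustrative assumption, not the actual GraderGPT implementation: the function name, rubric schema, and prompt layout are all hypothetical, and the real system works from Canvas data and the U-M GPT Toolkit.

```python
# Hypothetical sketch of prompt assembly: rubric criteria, grader notes,
# and a student submission are combined into one grading prompt that an
# LLM could score. All names and formats here are illustrative.

def build_grading_prompt(rubric, submission, grader_notes=""):
    """Assemble a grading prompt from rubric criteria and a submission."""
    criteria_lines = [
        f"- {c['name']} (max {c['points']} pts): {c['description']}"
        for c in rubric
    ]
    sections = [
        "You are assisting a human grader. Score the submission against "
        "each rubric criterion and briefly justify each score.",
        "Rubric:\n" + "\n".join(criteria_lines),
    ]
    if grader_notes:
        sections.append("Grader notes:\n" + grader_notes)
    sections.append("Submission:\n" + submission)
    return "\n\n".join(sections)

# Example rubric and submission (invented for illustration)
rubric = [
    {"name": "Thesis", "points": 5,
     "description": "Clear, arguable central claim."},
    {"name": "Evidence", "points": 10,
     "description": "Relevant, properly cited support."},
]
prompt = build_grading_prompt(rubric, "Exercise improves cognition because...")
```

Building the prompt per submission, rather than using one static prompt, lets the same pipeline cover every assignment whose rubric and materials are already in Canvas.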

GraderGPT does not replace human graders; it supports them by providing a baseline for comparison, helping to identify variances in scoring. The system was tested in two courses as part of an initial study. Preliminary feedback indicated that GraderGPT was useful both for surfacing scoring variances and for improving the quality of the grading criteria themselves, leading to clearer rubrics and instruction.
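The baseline-comparison idea above can be sketched as a simple check that flags submissions where a human score and the AI score diverge by more than some threshold. This is a minimal illustration under assumed data shapes, not how GraderGPT actually performs the comparison.

```python
# Hypothetical variance check: compare human and AI scores per submission
# and flag large disagreements for review. Names and the threshold value
# are illustrative assumptions.

def flag_variances(human_scores, ai_scores, threshold=2.0):
    """Return submission IDs whose human and AI scores differ by more
    than `threshold` points; missing AI scores are skipped."""
    return sorted(
        sid for sid, h in human_scores.items()
        if sid in ai_scores and abs(h - ai_scores[sid]) > threshold
    )

human = {"s1": 8, "s2": 5, "s3": 9}
ai = {"s1": 7, "s2": 9, "s3": 9}
flagged = flag_variances(human, ai)  # only "s2" differs by more than 2 points
```

A flagged submission is not assumed to be mis-graded; it simply prompts a second look, which is how the tool supports rather than replaces the human grader.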