
Beyond the Final Draft: Integrating Generative AI Into Second Language Writing Assessment
Johanathan Woodworth, Mount Saint Vincent University, Canada
Statement of the Problem
Generative artificial intelligence (AI) is now common in education. It can write essays, generate feedback, and suggest prompts. In second language writing classrooms, these tools bring both benefits and risks. For example, an English learner might use ChatGPT to model an argumentative introduction and then write their own version to build awareness of organisation.
At the same time, AI often follows a standardised form of English. It may discourage creative or culturally specific expression and favour “native-like” patterns. As a result, students can lose confidence in their own voices, and inequalities can increase through limited access or linguistic bias (Flores & Rosa, 2015).
Professional development in language teaching often focuses on using new technologies rather than integrating them ethically or pedagogically (Liu et al., 2023). This emphasis on technical skills can leave teachers unprepared to address authorship, fairness, and language diversity in assessment.
Background
Research on second language writing shows that effective assessment supports writing as a process of drafting, feedback, and revision that builds awareness and responsibility for improvement (Ferris, 2014). Effective assessment balances product and process, encouraging peer and self-review to strengthen reflection.
Generative AI can support this process by providing models, suggestions, and feedback that expose writers to new ways of expressing ideas. However, AI-generated text often follows narrow patterns of standard academic English, limiting linguistic diversity and personal voice (Albeihi & Rice, 2025). Without careful guidance, students may rely too heavily on AI feedback and lose opportunities for independent revision and decision-making.
Using AI ethically requires what researchers describe as “AI literacy” (Chiu et al., 2024). This involves understanding how AI works, recognising bias, and integrating it in ways that promote fairness and linguistic diversity. Teachers in this study emphasised the importance of assessment criteria that view writing as a reflective process and make AI use visible in student work.
Research Design
This study explored how language teachers understand and use generative AI in writing assessment. It asked two main questions:
- How does participation in professional development influence teachers’ knowledge and confidence with AI?
- How do these changes affect their views of assessment, feedback, and their professional roles?
To answer these questions, fifteen in-service English language teachers took part in a three-part professional development series held online over three consecutive weeks. Each 90-minute session included group discussion, demonstrations of AI tools, and reflection activities.
- Session 1 introduced what generative AI can and cannot do in writing classrooms, focusing on feedback and text modelling.
- Session 2 examined bias, fairness, and authorship, and how to adapt rubrics to account for AI use.
- Session 3 addressed accessibility and inclusion, helping teachers plan equitable and transparent assessment practices.
Between sessions, participants completed short reflection tasks and online discussions connecting workshop ideas to their own teaching.
Participants
Participants represented a range of teaching contexts and experience levels, as shown in Table 1. Most were experienced teachers working in primary or secondary education. Few had formal training in AI or digital assessment, which made this an important learning opportunity.
Table 1
Participant Demographics and Teaching Background
| Characteristic | n | % |
|---|---|---|
| Female | 11 | 73.3 |
| Male | 4 | 26.7 |
| Teaching experience: 1–3 years | 2 | 13.3 |
| Teaching experience: 4–7 years | 2 | 13.3 |
| Teaching experience: 8–15 years | 5 | 33.3 |
| Teaching experience: 15+ years | 6 | 40.0 |
| Elementary/primary | 10 | 66.7 |
| Middle school | 3 | 20.0 |
| High school | 2 | 13.3 |
This group reflected a range of career stages, with most teachers having more than eight years of experience. Many described feeling confident in pedagogy but less so in digital assessment or AI integration.
Data Sources and Analysis
Teacher learning was measured using a modified version of the AI Literacy Scale for Teachers (Younis, 2025). The 45-item survey used a five-point scale to assess knowledge, skills, and ethical awareness related to AI. Five items focused specifically on assessment, such as understanding AI-generated feedback and designing fair evaluation tasks.
Teachers also completed three structured written reflections—one after each session—about how AI might change their approaches to feedback and assessment. These reflections provided insight into teachers’ evolving understanding of fairness, authorship, and professional identity.
Survey results and written reflections were then analysed together to trace how teachers’ AI literacy and assessment perspectives developed through the workshops.
Analysis of Results
Quantitative Trends
AI literacy scores improved by an average of 1.13 points on a 5-point scale. The largest gain was in understanding how AI tools can provide feedback (+1.33). Teachers also became more familiar with assessment technologies (+1.20) and more confident using data from these tools to inform teaching (+1.20). Confidence that AI could positively affect learning rose by 1.00, while understanding of personalised assessment for diverse learners increased modestly (+0.47).
These results show that participants grew more confident in applying AI for feedback and instructional decisions but remained uncertain about designing assessments that adapt to diverse learner needs.
Table 2
Pre- and Post-Intervention Mean Scores on Assessment-Related AI Literacy Items (n = 15)
| Item | Pre Mean | Post Mean | Change |
|---|---|---|---|
| I have used or am familiar with AI-based tools for student assessment purposes. | 2.27 | 3.47 | 1.20 |
| I understand how AI-based assessment tools could provide feedback on student performance. | 2.73 | 4.07 | 1.33 |
| I know how to interpret data from assessment tools to inform instructional decisions. | 2.40 | 3.60 | 1.20 |
| I believe that technology-enhanced assessment can positively impact student learning. | 3.20 | 4.20 | 1.00 |
| I understand the concept of personalized assessment approaches for diverse learners. | 3.67 | 4.13 | 0.47 |
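As an illustrative check (not part of the study's analysis), the change column in Table 2 is simply the post-intervention mean minus the pre-intervention mean. The item labels below are shortened for readability, and the means are the rounded values reported in the table; recomputing from these rounded values gives 1.34 and 0.46 for two items rather than the reported 1.33 and 0.47, presumably because the reported changes were calculated from unrounded scores.

```python
# Recompute Table 2's change column from the reported (rounded) means.
# Labels are shortened paraphrases of the survey items.
items = {
    "Familiar with AI-based assessment tools": (2.27, 3.47),
    "Understand AI-generated feedback": (2.73, 4.07),
    "Interpret assessment data": (2.40, 3.60),
    "Believe tech-enhanced assessment helps learning": (3.20, 4.20),
    "Understand personalized assessment": (3.67, 4.13),
}

changes = {label: round(post - pre, 2) for label, (pre, post) in items.items()}
for label, change in changes.items():
    print(f"{label}: +{change:.2f}")

# Mean gain across these five assessment items only; the reported overall
# gain of 1.13 presumably reflects the full 45-item scale, not shown here.
mean_gain = round(sum(changes.values()) / len(changes), 2)
print(f"Mean gain across assessment items: {mean_gain:.2f}")
```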
Qualitative Patterns
Five key themes emerged from teachers’ reflections on AI in second language writing assessment: feedback and process, rubric adaptation, voice and diversity, equity and accessibility, and teacher role.
1. Feedback and Process Orientation
Teachers increasingly described AI as a formative tool rather than a grading device. Many asked students to include drafts and AI transcripts to show how feedback was incorporated. One participant stated, “I believe we need to … promote and reward the process of learning in balance with the final product. Maybe … make them turn in chat transcripts of their collaborative work with AI.” These strategies reflected an effort to foster metacognitive engagement and ongoing revision.
2. Rubric Adaptation and Transparency
Teachers argued that rubrics should assess how thoughtfully students engage with AI, not just the quality of the final text. As one noted, “Students should not only show their final essay but also demonstrate how they used AI responsibly throughout the process.” Several teachers observed that student disclosure of AI use declined over time. One explained, “After a few weeks, students stopped mentioning AI unless specifically prompted.” Teachers saw this as a sign that AI use was becoming invisible within classroom practice and suggested adding reflective statements or rubric criteria to ensure ongoing transparency.
3. Voice and Linguistic Diversity
Concerns about voice and linguistic diversity were frequent. Teachers worried that AI flattens style by favouring standardised English. One teacher shared, “Some AI tools flatten language choices and favour native-like grammar, which risks erasing the rich voices of our multilingual students.” Another emphasised, “AI platforms may privilege specific grammar structures or accents … I’d want to be sure the tool doesn’t exclude students by promoting too narrow an idea of language proficiency.” As one teacher concluded, “We must resist AI’s push toward standardisation and instead foster an environment where every student’s unique voice and linguistic identity are honoured.”
4. Equity and Accessibility
Equity concerns were central to discussions. Teachers recognised that while AI can provide support, it can also reproduce inequities for students with limited digital access or language proficiency. To counter this, some encouraged flexible approaches. One explained, “I will encourage students to use speech-to-text tools for writing … I will also support students in using translanguaging methods to express their understanding.” In practice, these strategies positioned AI as a scaffold rather than a gatekeeper, allowing students to use multiple languages and modes to engage with writing tasks.
5. Teacher Role and Agency
Although some teachers expressed concern about losing control over assessment, most viewed themselves as responsible mediators between students and AI tools. One reflected, “AI should support our teaching, not replace it … [C]lear discussions on what GenAI can and can’t do … are important so students have a clearer sense of what they should or should not do.” Teachers saw their role as setting boundaries and modelling critical engagement with technology rather than simply adopting or rejecting it.
Discussion
From Tool Adoption to Critical Mediation
At the start of the workshops, teachers mainly focused on the technical benefits of AI, such as time savings and efficiency. By the final sessions, their attention had shifted toward more complex issues of fairness, bias, and student voice. This evolution suggests that structured reflection and peer dialogue help teachers move from using AI as a tool to critically mediating its role in learning (Selwyn, 2019).
Assessment Redesign
Teachers proposed new ways to document AI use in writing tasks, including process portfolios that combine prompts, AI outputs, and student revisions. This approach aligns assessment with broader learning goals such as critical thinking and rhetorical awareness rather than focusing only on final products. It also increases validity by making AI involvement visible and discussable in grading.
Equity and Multilingual Inclusion
Concerns about bias and linguistic dominance highlighted the need for more inclusive assessment practices. Teachers recommended rubrics that recognise communicative effectiveness and rhetorical adaptability rather than penalising non-standard English. Designing assessments that include multilingual exemplars and allow diverse forms of expression can help ensure fairness across linguistic backgrounds.
Professional Learning as Dialogue
Collaborative exchange played an important role in teacher learning. Peer discussions in forums and small-group sessions encouraged participants to question assumptions and share new ideas about managing AI in assessment. This dialogic process deepened teachers’ ethical reasoning and positioned professional development as a shared construction of knowledge rather than a one-directional training model.
Conclusion
This study shows that professional learning focused on AI in writing assessment can lead to concrete pedagogical change. Teachers moved from viewing AI primarily as a time-saving tool to using it deliberately to support feedback, reflection, and student agency. They developed greater confidence in designing assessment tasks that integrate AI transparently, emphasise process and voice, and address issues of bias and accessibility.
Key findings include:
- Increased teacher capacity to design AI-mediated writing activities that foster metacognitive awareness and maintain student authorship.
- Wider use of rubrics and reflection tasks that make AI use explicit and assess it critically.
- Heightened attention to equity, accessibility, and linguistic diversity when evaluating AI-supported writing.
Future research should connect these teacher learning outcomes with student writing evidence to examine how such shifts influence assessment practice in real classrooms. The goal is not to ban or normalise AI, but to integrate it in pedagogically principled and ethically transparent ways.
References
Albeihi, H. H. M., & Rice, M. F. (2025). Generative AI and language diversity: Implications for teachers and learners. Arab World English Journal, 16(1). https://ssrn.com/abstract=5202809
Chiu, T. K., Ahmad, Z., Ismailov, M., & Sanusi, I. T. (2024). What are artificial intelligence literacy and competency? A comprehensive framework to support them. Computers and Education Open, 6, 100171. https://doi.org/10.1016/j.caeo.2024.100171
Ferris, D. R. (2014). Responding to student writing: Teachers’ philosophies and practices. Assessing Writing, 19, 6–23. https://doi.org/10.1016/j.asw.2013.09.004
Flores, N., & Rosa, J. (2015). Undoing appropriateness: Raciolinguistic ideologies and language diversity in education. Harvard Educational Review, 85(2), 149–171. https://doi.org/10.17763/0017-8055.85.2.149
Liu, T., Zhang, Z., & Gao, X. (2023). Pedagogical design in technology-enhanced language education research: A scoping review. Sustainability, 15(7), 6069. https://doi.org/10.3390/su15076069
Selwyn, N. (2019). Should robots replace teachers? AI and the future of education. Polity.
Younis, B. (2025). The Artificial Intelligence Literacy (AIL) scale for teachers: A tool for enhancing AI education. Journal of Digital Learning in Teacher Education, 41(1), 37–56. https://doi.org/10.1080/21532974.2024.2441682
AI Use Statement
Artificial intelligence (OpenAI’s ChatGPT) was used in the preparation of this manuscript to assist with word count reduction. All intellectual content, analysis, and interpretation were developed by the author. The author reviewed and verified all AI-assisted edits for accuracy and alignment with the intended meaning.
Johanathan Woodworth is Assistant Professor of Educational Technology at Mount Saint Vincent University in Halifax, Canada. His research focuses on epistemology, pedagogy, and the integration of artificial intelligence into language assessment and teacher education, with particular attention to professional development, critical AI literacy, and ethical uses of educational technologies.
