A COMPARATIVE EVALUATION OF AI-GENERATED FEEDBACK: GEMINI VS. CHATGPT IN ASSESSING EFL STUDENTS’ GRAMMATICAL RANGE AND ACCURACY

Authors

  • I Made Agung Rai Antara, Department of Hotel Management, Triatma Mulya University
  • Ni Putu Yunik Anggreni, Department of Tourism, Triatma Mulya University

DOI:

https://doi.org/10.59672/stilistika.v14i2.6280

Keywords:

Artificial Intelligence, Grammatical Assessment, ChatGPT, Gemini

Abstract

This study compares the performance of ChatGPT 5.3 and Gemini Pro 3.1 in assessing Grammatical Range and Accuracy (GRA) in recount essays written by intermediate-level students at Triatma Mulya University. It employed a comparative qualitative design with a content analysis approach to evaluate and compare the quality of the grammatical feedback that each artificial intelligence system generated independently. The data consisted of 15 recount essays, which were analyzed for error types, accuracy levels, and grammatical range. The findings revealed that both models were highly consistent in identifying major grammatical errors, particularly in tense usage, sentence structure, and capitalization. However, the models differed notably in depth of analysis and sensitivity to minor errors. Gemini Pro 3.1 tended to provide more detailed, rule-based feedback, whereas ChatGPT 5.3 offered simpler explanations that were easier for students to understand. Furthermore, Gemini was the stricter evaluator, while ChatGPT took a more moderate approach in classifying grammatical accuracy and range. These findings suggest that both systems have strong potential for grammatical assessment, albeit with different orientations, making them complementary tools in English writing instruction.

Published

2026-05-29

How to Cite

Antara, I. M. A. R., & Anggreni, N. P. Y. (2026). A COMPARATIVE EVALUATION OF AI-GENERATED FEEDBACK: GEMINI VS. CHATGPT IN ASSESSING EFL STUDENTS’ GRAMMATICAL RANGE AND ACCURACY. Stilistika: Jurnal Pendidikan Bahasa Dan Seni, 14(2), 237–247. https://doi.org/10.59672/stilistika.v14i2.6280