The Stylistic Fingerprint of AI: Mapping Lexical Bundles and N-Grams in Automated Academic Writing
Keywords:
IELTS, Large Language Models, Corpus Linguistics, Lexical Bundles, AI-ese, Academic Writing
Abstract
Large Language Models (LLMs) have transformed how students prepare for high-stakes English language assessments, including the widely used International English Language Testing System (IELTS). Although LLMs produce grammatically correct sentences, they are probabilistic models that favor likely linguistic patterns over the idiosyncratic variation associated with natural human fluency. The present study uses a corpus-driven analysis to explore the formulaic stylistic patterns (“AI-ese”) that emerge in LLM-generated IELTS Academic Writing Task 2 essays. It builds and analyzes a purpose-built corpus of 20 model essays and finds that most of the essays share a distinctive, algorithmically extracted stylistic fingerprint marked by highly repetitive 3-word and 4-word lexical bundles. The findings reveal that the model consistently relies on pre-packaged rhetorical units, for instance “as is widely accepted” and “the modern era”, to create an illusion of academic objectivity. The study argues that although these formulaic sequences are structurally correct, the epistemic stance markers and authorial presence they produce lack the complexity and sophistication that the IELTS assessment criteria reward. The findings indicate that relying on AI-generated model essays for test preparation without critical evaluation may hinder the development of genuine communicative ability and harm test takers' performance.
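As an illustration of what extracting 3-word and 4-word lexical bundles can involve, the following is a minimal Python sketch, not the procedure used in this study. It assumes the common corpus-linguistic convention of keeping only n-grams that meet a frequency threshold and a range (number of distinct texts) threshold. The function names, the threshold values, and the two-sentence toy corpus are hypothetical and chosen only for demonstration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous n-token sequence as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def lexical_bundles(essays, n, min_freq=2, min_range=2):
    """Return {bundle: (frequency, range)} for n-grams meeting both thresholds.

    frequency = total occurrences across the corpus
    range     = number of distinct essays that contain the n-gram
    The default thresholds are illustrative assumptions, not the study's values.
    """
    freq = Counter()
    doc_range = Counter()
    for essay in essays:
        grams = list(ngrams(tokenize(essay), n))
        freq.update(grams)            # count every occurrence
        doc_range.update(set(grams))  # count each essay at most once
    return {
        " ".join(gram): (freq[gram], doc_range[gram])
        for gram in freq
        if freq[gram] >= min_freq and doc_range[gram] >= min_range
    }


# Hypothetical toy corpus, not data from the study.
toy_corpus = [
    "In the modern era, it is widely accepted that technology shapes education.",
    "It is widely accepted that cities grow quickly in the modern era.",
]

three_word = lexical_bundles(toy_corpus, n=3)
four_word = lexical_bundles(toy_corpus, n=4)

for bundle, (frequency, text_range) in sorted(three_word.items()):
    print(f"{bundle}: frequency={frequency}, range={text_range}")
```

On a real corpus the thresholds would normally be normalized (for example, occurrences per million words) and set in advance; with only 20 essays, the range criterion does most of the work of separating shared formulaic sequences from topic-specific repetition within a single essay.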

