Part-of-Speech Distribution across Proficiency and Advanced EFL Texts: A Quantitative Comparison for Pedagogical Application
Keywords:
POS tagging, corpus linguistics, computational linguistics, data analysis, PERMANOVA
Abstract
This study investigated grammatical variation between Advanced Masterclass and Proficiency Masterclass EFL textbook and workbook texts to determine whether part-of-speech (POS) distributions change systematically across the CEFR C1–C2 interface. A balanced corpus of 60 reading texts (30 per level) was compiled, POS-tagged with spaCy, and analyzed quantitatively using Welch’s t-tests, Mann–Whitney U tests, effect sizes, false-discovery-rate correction, and robust 20% trimmed-mean tests. A multivariate PERMANOVA confirmed a small but significant global difference between levels (F = 2.624, p = .006, R² ≈ .03). Individual contrasts indicated that Proficiency texts contained relatively higher proportions of determiners and prepositions, while Advanced texts featured greater use of numerals, adjectives, and adverbs. Taken together, these small but systematic differences suggest that Proficiency texts use more cohesive, narrative-oriented grammar (determiners, pronouns, prepositions), whereas Advanced texts make relatively greater use of informational or expository elements (numerals, comparative adjectives, adverbs). The study illustrates how transparent, code-based POS profiling can reveal subtle grammatical distinctions in pedagogical materials and support evidence-informed textbook evaluation. By combining classical, non-parametric, robust, and multivariate analyses, the approach supports replicable results and provides a methodological template for future corpus-based research on advanced-level language input. The findings underscore the pedagogical value of aligning grammatical exposure with discourse progression from C1 to C2 in EFL instruction.
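To make the multivariate step more concrete, the following is a minimal sketch of a one-way PERMANOVA on per-text POS proportions. It is not the study's own code: the Euclidean distance, the permutation count, the function names (`squared_distances`, `pseudo_f`, `permanova`), and the toy proportions are all assumptions chosen for illustration, and the abstract does not state which distance measure or software was used.

```python
import random
from itertools import combinations


def squared_distances(rows):
    """Pairwise squared Euclidean distances between POS-proportion vectors.

    Euclidean distance is an assumption; the abstract does not name the measure.
    """
    n = len(rows)
    d2 = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
        d2[i][j] = d2[j][i] = d
    return d2


def pseudo_f(d2, labels):
    """Return (pseudo-F, R^2) for a one-way design from squared distances."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: all pairwise squared distances divided by N.
    ss_total = sum(d2[i][j] for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(d2[i][j] for i, j in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    f = (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return f, ss_between / ss_total


def permanova(rows, labels, n_perm=999, seed=0):
    """Permutation test: shuffle level labels and recompute pseudo-F."""
    rng = random.Random(seed)
    d2 = squared_distances(rows)
    f_obs, r2 = pseudo_f(d2, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        f_perm, _ = pseudo_f(d2, shuffled)
        if f_perm >= f_obs:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return f_obs, (hits + 1) / (n_perm + 1), r2


# Hypothetical toy data: proportions of (DET, ADP, ADJ) per text, NOT study data.
advanced = [[0.10, 0.11, 0.09], [0.11, 0.10, 0.10], [0.09, 0.12, 0.09],
            [0.10, 0.10, 0.11], [0.11, 0.11, 0.10], [0.10, 0.12, 0.10]]
proficiency = [[0.14, 0.14, 0.06], [0.15, 0.13, 0.07], [0.13, 0.15, 0.06],
               [0.14, 0.13, 0.07], [0.15, 0.14, 0.05], [0.14, 0.15, 0.06]]
rows = advanced + proficiency
labels = ["C1"] * len(advanced) + ["C2"] * len(proficiency)

f_stat, p_value, r2 = permanova(rows, labels, n_perm=199, seed=1)
print(f"pseudo-F = {f_stat:.3f}, p = {p_value:.3f}, R^2 = {r2:.3f}")
```

In a real analysis, each row would hold the POS proportions of one tagged text, so the test asks whether level labels explain more of the between-text distance than random relabelling would. The toy groups here are deliberately far apart, so the resulting statistics say nothing about the reported F, p, or R² values.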
References
[1] Biber, D. (1988). Variation across speech and writing. Cambridge University Press. https://doi.org/10.1017/CBO9780511621024
[2] Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and written English. Pearson Education.
[4] Oxford University Press. (2012). Advanced Masterclass: Student’s Book. OUP.
[5] Oxford University Press. (2012). Advanced Masterclass: Workbook. OUP.
[6] Oxford University Press. (2015). Proficiency Masterclass: Student’s Book. OUP.
[7] Oxford University Press. (2015). Proficiency Masterclass: Workbook. OUP.
[8] Römer, U. (2006). Pedagogical applications of corpora: Some reflections on the current scope and a wish list for future developments. Zeitschrift für Angewandte Linguistik, 44(2), 121–134.
[9] Römer, U., Cortes, V., & Friginal, E. (Eds.). (2020). Advances in corpus-based research on academic writing: Effects of discipline, register, and writer expertise. John Benjamins Publishing Company.
License
Copyright (c) 2025 International Journal of Sciences: Basic and Applied Research (IJSBAR)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.