Anne O'Keeffe

Dr Anne O'Keeffe is a Senior Lecturer in Applied Linguistics at Mary Immaculate College, University of Limerick, Ireland. Her research output includes papers, chapters and books on Corpus Linguistics and Language Teaching, Pragmatics and Media Discourse. These include Investigating Media Discourse (2006, Routledge), From Corpus to Classroom (2007, Cambridge University Press, with Michael McCarthy and Ronald Carter), English Grammar Today (2011, Cambridge University Press, with Ronald Carter, Michael McCarthy and Geraldine Mark) and Introducing Pragmatics in Use (2011, Routledge, with Brian Clancy and Svenja Adolphs). She also co-edited the Routledge Handbook of Corpus Linguistics (with Michael McCarthy) and is currently working on its second edition. She was one of the Principal Investigators on the English Grammar Profile, a research project commissioned by Cambridge University Press that explored the Cambridge Learner Corpus in order to build an online grammar competency framework resource. She has also guest-edited a number of international journals, most recently Corpus Pragmatics, and she is co-editor of two Routledge book series, Routledge Corpus Linguistic Guides and Routledge Applied Corpus Series. Dr O'Keeffe was also responsible for establishing the Inter-Varietal Applied Corpus Studies (IVACS) research centre, which has grown over the years into a vibrant research network.


Using corpus linguistics to explore learner English

Learner corpora of English have been amassing for more than two decades (Granger et al. 2015). Research from this body of data has informed our understanding of morphology, lexis, syntax and so on, across L1 and within L2 use (e.g. Murakami 2013). Learner corpus studies also look at how the development of individual language items can differ across learners from different L1 backgrounds, at different stages of learning (e.g. Thewissen 2013). However, researching Second Language Acquisition (SLA) longitudinally is challenging, especially when one moves beyond the examination of single linguistic items. It is highly unlikely that researchers could ever gain a balanced and sizable sample of individual (second) language learners, and of learner-directed language, within their full learning environment for all of their time as language learners. This means researchers can only view SLA longitudinally via case studies (e.g. Myles, Mitchell and Hopper 1999) or via quasi-longitudinal corpora (Meunier 2015).

Building a quasi-longitudinal corpus means using a variable such as age, year of study or level of proficiency as a proxy for the calibration of time (Meunier 2015). In this way, the language of learners from different L1 backgrounds can be examined and compared at a given age, year of study or proficiency level. This paper will focus on the shift from the use of age and year of study as the primary calibration for learner corpora to the more recent model of using the six levels of the Common European Framework of Reference (CEFR) as a calibration tool for learner corpus design (see O’Keeffe and Mark 2017). Though this model is not without its flaws (Meunier 2015), it is argued that the CEFR offers a more robust proxy for time in the design of quasi-longitudinal learner corpora than age or year of study. Using the 55-million-word CEFR-calibrated Cambridge Learner Corpus, this paper will explore models for the longitudinal analysis of language development and acquisition. It will make links to the emerging work that draws on usage-based models of language acquisition and will argue that examining learner English using quasi-longitudinal corpora needs more consideration and conceptualisation in terms of research design. The paper will make the case that, to gain a more prototypical overview of second language acquisition, we need to move beyond the examination of single linguistic items.


Granger, S., Gilquin, G., & Meunier, F. (Eds.) (2015). The Cambridge Handbook of Learner Corpus Research. Cambridge University Press.

Meunier, F. (2015). Developmental patterns in learner corpora. In Granger, S., Gilquin, G., & Meunier, F. (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 379-400). Cambridge: Cambridge University Press.

Murakami, A. (2013). Cross-linguistic influence on the accuracy order of L2 English grammatical morphemes. Twenty years of learner corpus research. Looking back, moving ahead: Corpora and Language in Use, 1, 325-334.

Myles, F., Mitchell, R. & Hopper, J. (1999). Interrogative chunks in French L2: a basis for creative construction? Studies in Second Language Acquisition, 21(1), 49-80.

O’Keeffe, A. & Mark, G. (2017). The English Grammar Profile of learner competence: Methodology and key findings. International Journal of Corpus Linguistics 22(4), 457-489.

Thewissen, J. (2013). Capturing L2 accuracy developmental patterns: insights from an error tagged learner corpus. The Modern Language Journal, 97, 77-101.