Estimating Learners’ Vocabulary Size under Item Response Theory
Aaron Gibson and Jeffrey Stewart
Kyushu Sangyo University
Perhaps the most qualitatively interpretable vocabulary test score is an
estimate of the total number of words the learner knows in the tested
domain, such as a frequency word list, or vocabulary taught as part of a
course curriculum. In cases where it is not possible to test the entire
domain word-for-word, vocabulary tests such as the Vocabulary Levels Test
(Nation, 1990) and the Vocabulary Size Test (Beglar, 2010; Nation & Beglar,
2007) typically employ a polling method, in which total vocabulary size is
inferred from a sample of tested words. A drawback of this method is that
it assumes the tested words are randomly sampled from, and therefore
representative of, the tested domain, which can affect test reliability in
cases where there are many words in the domain that are far below or
above learners’ ability. This paper outlines an alternative method for
estimating vocabulary size from a test score using item response theory.
The method allows total vocabulary size to be estimated from a nonrandom
sample of words well matched to learners’ ability, yielding tests of
practical length and high reliability. Such a test scoring method,
currently in use at a private university in southern Japan, is used as an

Gibson, A., & Stewart, J. (2014). Estimating learners’ vocabulary size under item response theory. Vocabulary Learning and Instruction, 3 (2), 78-84. doi: 10.7820/vli.v03.2.gibson.stewart
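
As a rough illustration of the general idea summarized in the abstract, the following minimal sketch assumes a Rasch (one-parameter logistic) model and assumes that a calibrated difficulty is available for every word in the domain. Neither assumption is stated in the abstract. The function names, the bisection search, and all difficulty values are hypothetical and chosen only for illustration; this is not presented as the authors' scoring procedure. A learner's ability is estimated from a nonrandom sample of well-matched items, and the vocabulary size estimate is then taken as the sum of the model's success probabilities over all words in the domain.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def expected_score(theta, difficulties):
    """Expected number of known words among items with the given difficulties."""
    return sum(rasch_probability(theta, b) for b in difficulties)


def estimate_ability(responses, item_difficulties, lo=-6.0, hi=6.0, tol=1e-6):
    """Estimate ability by bisection on the Rasch score equation.

    Under the Rasch model the maximum likelihood estimate is the ability
    at which the expected score on the tested items equals the raw score.
    Zero and perfect scores have no finite estimate, so they are clamped
    to the search bounds (an assumption made for this sketch).
    """
    raw_score = sum(responses)
    if raw_score == 0:
        return lo
    if raw_score == len(responses):
        return hi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, item_difficulties) < raw_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical calibrated difficulties (in logits) for every word in a tiny domain.
domain_difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical test: a nonrandom sample of four items near the learner's level,
# with responses scored 1 (correct) or 0 (incorrect).
tested_difficulties = [-1.0, -0.5, 0.5, 1.0]
responses = [1, 1, 0, 0]

theta_hat = estimate_ability(responses, tested_difficulties)
vocabulary_size_estimate = expected_score(theta_hat, domain_difficulties)
print(f"ability estimate: {theta_hat:.3f}")
print(f"estimated words known: {vocabulary_size_estimate:.2f} of {len(domain_difficulties)}")
```

In this sketch the tested items do not need to be a random sample of the domain, because the size estimate is computed from the ability estimate and the difficulties of all domain words rather than by scaling up the proportion of tested words answered correctly.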