Relationships Between Text Length and Lexical Diversity Measures: Can We Use Short Texts of Less than 100 Tokens?
Lexical diversity (LD) measures are known to be sensitive to text length, and numerous revised LD measures have been proposed to address this. This study aims to identify the LD measures that are least affected by text length and can be used to analyze short L2 texts (50-200 tokens). It compares the type-token ratio, the Guiraud index, D, and the measure of textual lexical diversity (MTLD) to assess how susceptible each is to text length. Spoken texts of 200 tokens from 20 lower-intermediate L2 English learners were divided into segments of 50 to 200 tokens, and the impact of text length was examined. MTLD was found to be the least affected by text length, and it should be used with texts of at least 100 tokens.
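To make the length sensitivity concrete, the two simplest measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it assumes the standard definitions (type-token ratio as types divided by tokens, Guiraud index as types divided by the square root of tokens), whitespace tokenization, and a made-up sample sentence. The function names are hypothetical.

```python
import math


def type_token_ratio(tokens):
    """Types divided by tokens (standard TTR definition)."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    return len(set(tokens)) / len(tokens)


def guiraud_index(tokens):
    """Types divided by the square root of tokens (standard Guiraud definition)."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    return len(set(tokens)) / math.sqrt(len(tokens))


# Made-up sample text, tokenized on whitespace for illustration only.
sample = "the cat sat on the mat and the dog sat on the rug".split()

# Compare a short prefix with the full text to see how each score shifts
# as the text gets longer.
for length in (6, len(sample)):
    segment = sample[:length]
    print(length, round(type_token_ratio(segment), 3), round(guiraud_index(segment), 3))
```

In this toy sample the type-token ratio falls as the segment grows, because repeated words add tokens without adding types. D and MTLD involve more elaborate procedures and are not sketched here.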
lexical diversity; text length; type-token ratio; Guiraud index; D; measure of textual lexical diversity (MTLD); speaking performance.
Koizumi, R. (2012). Relationships between text length and lexical diversity measures: Can we use short texts of less than 100 tokens? Vocabulary Learning and Instruction, 1(1), 60–69. doi: 10.7820/vli.v01.1.koizumi