This article describes how to implement a simple lexical decision experiment to assess written word recognition in neurologically healthy participants and in individuals with dementia and cognitive decline. We also provide a detailed description of reaction time analysis using principal components analysis (PCA) and mixed-effects modeling.
Older adults are slower at recognizing visual objects than younger adults. The same is true for recognizing that a letter string is a real word. People with Alzheimer's disease (AD) or Mild Cognitive Impairment (MCI) demonstrate even longer responses in written word recognition than elderly controls. Despite the general tendency towards slower recognition in aging and neurocognitive disorders, certain characteristics of words influence word recognition speed regardless of age or neuropathology (e.g., a word’s frequency of use). We present here a protocol for examining the influence of lexical characteristics on word recognition response times in a simple lexical decision experiment administered to younger and older adults and people with MCI or AD. In this experiment, participants are asked to decide as quickly and accurately as possible whether a given letter string is an actual word or not. We also describe mixed-effects models and principal components analysis that can be used to detect the influence of different types of lexical variables or individual characteristics of participants on word recognition speed.
Words are stored in the mental lexicon in a highly interconnected network. The connections between words may reflect shared properties, such as semantic similarity (e.g., dog and cat), form similarity (dog and fog), or frequent co-occurrence in common language use (e.g., dog and leash). Cognitive theories of language, such as usage-based theory1, argue that every encounter of a word by a language user has an effect on the word’s mental representation. According to Exemplar Theory, a word’s representation consists of many exemplars, which are built up from individual tokens of language use and which represent the variability that exists for a given category. The frequency of use2 impacts representations in memory by contributing to the strength of an exemplar1.
Word recognition speed can reveal the characteristics of the mental lexicon. A commonly used experimental paradigm for measuring the speed of word recognition is the lexical decision task. In this task, participants are presented with letter strings on a monitor, one at a time. They are instructed to decide as quickly as possible whether the letter string on the screen is a real word or not by pressing the corresponding button.
By examining reaction times for real words, researchers can address a number of important questions about language processing. For example, identifying which factors make recognition faster can test hypotheses about the structure of the mental lexicon and reveal its architecture. Moreover, comparisons of performance across different groups of participants can help us understand the influence of various types of language experience, or, in the case of aging or neurodegenerative diseases (e.g., Alzheimer’s disease), the role of cognitive decline.
Some factors (e.g., the frequency of use) exert a greater influence on word recognition than other factors (e.g., word length). With advancing age, the way people recognize written words might change3,4. Younger adults tend to rely heavily on semantic (meaning-based) aspects of a word, such as how many compounds (e.g., bulldog) or derived words (e.g., doggy) share aspects of both form and meaning with the target word (in this case, dog). Word recognition for older adults appears to be more influenced by form-based aspects, such as the frequency with which two consecutive letters co-occur in the language (e.g., the letter combination st occurs more often in English words than the combination sk).
To determine the factors that influence the word recognition speed across different groups, the researcher can manipulate certain variables in the stimulus set and then test the power of these variables to predict word recognition speed. For example, to test whether word recognition is driven by semantic or form-based factors, the stimulus set should include variables that reflect the degree of connectivity of a word to its semantic neighbors in the mental lexicon or its connectivity to other words that share part of its form.
This method was used in the current study to investigate whether word recognition speed is influenced by different factors in younger and older adults and in individuals with Alzheimer’s disease (AD) or mild cognitive impairment (MCI)3. The method described here is based on visual word recognition but can be adapted to the auditory modality. However, some variables that are significant predictors of reaction times in a typical visual lexical decision experiment might not predict response latencies in an auditory lexical decision or may have the opposite effect. For example, the phonological neighborhood has the opposite effect across these two modalities5: words with larger phonological neighborhoods exhibit a facilitatory effect on visual word recognition but result in longer response latencies in auditory lexical decision6.
Word-finding difficulties in older adults7 have been generally attributed to difficulty accessing the phonological word form rather than a breakdown of the semantic representation8. However, AD research has primarily focused on semantic declines9,10,11,12,13,14. It is important to disentangle how semantic and orthographic factors influence the recognition of written words in aging with and without cognitive decline. The influence of form-related factors is more pronounced in older than in younger adults, and it remains significant in people with MCI or AD3. Thus, this methodology can help us uncover features of the mental lexicon across different populations and identify changes in the lexicon’s organization with age and neuropathology. One concern when testing patients with neuropathology is that they may have difficulties accessing task-related knowledge. However, the lexical decision task is a simple task with no burden on working memory or other complex cognitive skills that many patients exhibit problems with. It has been considered appropriate for AD and MCI populations.
The protocol follows the guidelines of the Ethics Committee of the Hospital District of Northern Savo (IRB00006251).
1. Participant screening
2. Stimulus construction
3. Experimental design
4. Experimental procedure
5. Analyzing data with a mixed-effects model in R
NOTE: Many different statistical programs can be used to perform the analysis. This section describes steps for analyzing data in R24.
Table 1 lists the variables that were included in the analysis as fixed-effect predictors. They were obtained from three different sources: a corpus, a dictionary, and pilot testing of test items. Many of these variables have been previously reported to affect word recognition speed.
Corpus: | |
Base frequency | the number of times a word appears in the corpus in all its different forms (e.g., child and children) |
Bigram frequency | the average number of times that all combinations of two subsequent letters occur in the corpus |
Morphological family size | the number of derived and compound words that share a morpheme with the noun |
Morphological family frequency | the summed base frequency of all morphological family members |
Pseudo-morphological family size | includes not only “true” morphological family members but also words that mimic morphological family members in their orthographic form, whether or not they are actual morphemes, and thus represents orthographic overlap but not necessarily semantic overlap |
Pseudo-morphological family frequency | the summed base frequency of all pseudo-morphological family members |
Surface frequency | the number of times a word appears in the corpus in exactly the same form (e.g., child) |
Trigram frequency | the average number of times that all combinations of three subsequent letters occur in the corpus |
Dictionary: | |
Hamming distance of one | the number of words of the same length but differing only in any single letter36 |
Length | the number of letters |
Orthographic neighborhood density | the number of words with the same length but differing only in the initial letter37,38 |
Pilot testing: | Sixteen participants indicated on a six-point scale (from 0 to 5) their estimates for each of the target words on the following parameters. |
As proper name | how often the word is seen as a proper name (e.g., as a family name, like Baker)39 |
Concreteness | the directness with which words refer to concrete entities40 |
Familiarity rating | how familiar the word is |
Imageability | the ease and speed with which words elicit mental images40 |
Table 1. The variables included in the mixed-effects analysis as fixed-effect predictors, obtained from three different sources (a corpus, a dictionary, and pilot testing of test items).
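The dictionary-based measures in Table 1 can be computed directly from a word list. The following is a minimal sketch in Python, used here purely for illustration (the analysis in this protocol was run in R). The function names and the toy English lexicon are hypothetical and are not taken from the study, which tested Finnish words.

```python
def hamming_distance_one(word, lexicon):
    """Count same-length lexicon entries that differ from `word` in exactly one letter."""
    count = 0
    for other in lexicon:
        if len(other) != len(word) or other == word:
            continue
        mismatches = sum(a != b for a, b in zip(word, other))
        if mismatches == 1:
            count += 1
    return count


def neighborhood_density(word, lexicon):
    """Count same-length entries that differ from `word` only in the initial letter."""
    return sum(
        1
        for other in lexicon
        if len(other) == len(word) and other != word and other[1:] == word[1:]
    )


# Hypothetical toy lexicon; a real analysis would use a full dictionary.
TOY_LEXICON = ["dog", "fog", "log", "dig", "dot", "cat", "dogs"]

measures = {
    "length": len("dog"),
    "hamming_one": hamming_distance_one("dog", TOY_LEXICON),
    "density": neighborhood_density("dog", TOY_LEXICON),
}
print(measures)
```

In this toy lexicon, fog, log, dig, and dot each differ from dog in a single letter, but only fog and log differ in the initial letter, so the two measures can diverge for the same word.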
The number of explanatory variables can be smaller or larger depending on the research questions and on the availability of the variables from databases, dictionaries, or corpora. However, including a large number of lexical features as predictors might lead to complications in the form of collinearity, which arises when predictors correlate with each other and thus exert similar effects on the outcome measure. For example, concreteness and imageability of words may be highly correlated. An assumption in any linear regression analysis is that the predictor variables are independent of each other. However, as more variables are added to the model, the risk that some of the variables are not independent of each other increases. The higher the correlation between the variables, the more harmful this collinearity can be for the model41. A potential consequence of collinearity is that the significance level of some predictors may be spurious.
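One simple way to screen for collinearity is to compute pairwise correlations between predictors and flag the pairs that correlate strongly. The sketch below does this in plain Python for illustration. The ratings are invented, the function names are hypothetical, and the cutoff of 0.7 is an assumption chosen for the example, not a value reported in the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(predictors, threshold=0.7):
    """Return (name, name, r) for every predictor pair with |r| >= threshold."""
    names = sorted(predictors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(predictors[first], predictors[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged


# Hypothetical values for six words; NOT data from the study.
ratings = {
    "concreteness": [4.8, 4.5, 1.2, 2.0, 3.9, 0.8],
    "imageability": [4.9, 4.1, 1.5, 2.2, 4.0, 1.0],
    "length": [5, 5, 6, 5, 7, 6],
}
flagged = collinear_pairs(ratings)
print(flagged)
```

With these invented values, only the concreteness and imageability pair is flagged, which mirrors the example given in the text.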
To avoid the effect of collinearity between predictors, the number of predictors should be reduced. If two predictors show collinearity, only one of them should be included in the model. However, if more than two predictors show collinearity, then excluding all but one would lead to a loss of variance explained. On the one hand, a researcher might reduce the number of explanatory variables a priori in the experimental design, leaving only those that are hypothesis driven (theoretically motivated) and that permit the researcher to test hypotheses between different populations. On the other hand, sometimes there is no existing theory, and thus it is reasonable to use Principal Component Analysis (PCA)41 to reduce the number of predictors by combining predictors that have similar effects into components. In this analysis, the predictor space was orthogonalized and the principal components of the new space were used as predictors (following the steps described in reference 41, pages 118-126). One disadvantage of using PCA is that the components can make it difficult to disentangle the effects of multiple predictors, because several predictors might emerge with strong loadings on the same principal component.
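To make the idea of combining correlated predictors into a component concrete, the sketch below extracts the first principal component from three standardized predictors using power iteration on their correlation matrix. It is an illustration in plain Python under stated assumptions: the data are invented, the function names are hypothetical, and the study itself followed the R-based procedure cited above rather than this code.

```python
from math import sqrt


def standardize(column):
    """Center a predictor on its mean and scale it to unit standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(z_columns):
    """Correlation matrix of already standardized columns."""
    n = len(z_columns[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
        for ci in z_columns
    ]


def first_component(matrix, iterations=200):
    """Power iteration: loadings and eigenvalue of the first principal component."""
    size = len(matrix)
    loadings = [1.0 / sqrt(size)] * size
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, loadings)) for row in matrix]
        norm = sqrt(sum(p * p for p in product))
        loadings = [p / norm for p in product]
    eigenvalue = sum(
        l * sum(m * v for m, v in zip(row, loadings))
        for l, row in zip(loadings, matrix)
    )
    return loadings, eigenvalue


# Hypothetical predictor values for six words; NOT data from the study.
names = ["concreteness", "imageability", "length"]
raw = [
    [4.8, 4.5, 1.2, 2.0, 3.9, 0.8],
    [4.9, 4.1, 1.5, 2.2, 4.0, 1.0],
    [5, 5, 6, 5, 7, 6],
]
z = [standardize(col) for col in raw]
loadings, eigenvalue = first_component(correlation_matrix(z))
explained = eigenvalue / len(names)  # share of total variance captured by PC1

# One PC1 score per word; this column can stand in for the correlated predictors.
pc1_scores = [sum(l * col[i] for l, col in zip(loadings, z)) for i in range(len(z[0]))]

for name, loading in zip(names, loadings):
    print(f"{name}: {loading:.3f}")
print(f"proportion of variance: {explained:.3f}")
```

In this toy example, concreteness and imageability load strongly and almost equally on the first component, which illustrates the disadvantage noted above: the component score alone cannot tell the two predictors apart.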
We transformed all lexical predictors into five principal components to examine how word recognition speed might differ between younger and older adults. Only two of them, PC1 and PC4, were significant in the young adults’ data (Table 3). Three principal components (PC1, PC2, and PC4) were significant predictors in the models for elderly controls (Table 4), individuals with MCI (Table 5), and individuals with AD (Table 6).
Variable | Loading on PC2 |
Bigram freq. | -0.390 |
Hamming distance of one | -0.350 |
Final trigram freq. | -0.330 |
Neighborhood density | -0.320 |
Length | -0.226 |
Initial trigram freq. | -0.224 |
Pseudo-family size (final) | -0.124 |
Pseudo-family freq. (final) | -0.052 |
Family freq. (compounds) | -0.042 |
Family size (compounds) | -0.039 |
Family freq. (derived words) | -0.036 |
Family size (derived words) | -0.034 |
Surface freq. | -0.023 |
Base freq. | -0.008 |
Pseudo-family size (initial) | 0.070 |
Familiarity rating | 0.093 |
As proper name | 0.102 |
Pseudo-family freq. (initial) | 0.113 |
Concreteness | 0.275 |
Imageability | 0.296 |
Pseudo-family size (internal) | 0.296 |
Pseudo-family freq. (internal) | 0.316 |
Table 2. The rotation matrix for PC2. The loadings are the degree to which each variable contributes to the component. This table has been modified with permission from Cortex3.
Table 2 presents the lexical variables with their loadings on PC2. The strongest positive loadings of PC2 were pseudo-family size and frequency for overlap in the internal position. The strongest negative loadings were bigram frequency, Hamming distance of one, final trigram frequency, and orthographic neighborhood density. Since all of these variables are primarily form-based rather than meaning-based, PC2 is interpreted as reflecting the influence of form-based aspects of a word on word recognition speed.
Table 3 shows the results of the mixed-effects analysis for young adults (31 participants). PC2 was not a significant predictor of young adults’ response times, which suggests that these form-based variables have less influence on young adults’ reaction times than on those of older adults, including those with AD or MCI.
Fixed effects | Estimate | Std.Error | t-value | p-value |
(Intercept) | -1.31 | 0.05 | -26.36 | <0.001 |
Allomorphs | -0.034 | 0.015 | -2.3 | 0.024 |
PC1 | -0.021 | 0.004 | -5.179 | <0.001 |
PC4 | -0.042 | 0.008 | -5.224 | <0.001 |
Random effects | | | | |
Groups | Name | Variance | Std.Dev. | Corr |
Item | (Intercept) | 0.009 | 0.095 | |
Subject | (Intercept) | 0.032 | 0.179 | |
| PC1 | 4.765e-05 | 0.007 | 0.08 |
Residual | | 0.005 | 0.235 | |
Number of obs. 2862; Item, 99; Subject, 31 |
Table 3. Estimated coefficients, standard errors, and t- and p-values for the mixed models fitted to the response latencies elicited for real words for young adults. This table has been modified with permission from Cortex3.
The Estimate for a fixed-effect variable can be interpreted as the amount by which the dependent variable (RT) increases or decreases when the value of that variable increases by one unit. A negative Estimate means that the variable correlates negatively with reaction times: the higher the value of the variable, the shorter (faster) the reaction times. As a rule of thumb, a predictor is considered significant when its t-value is less than -2 or greater than 2.
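The t-value is the Estimate divided by its standard error, so the rule of thumb can be checked by hand. The short Python sketch below does this for the rounded values printed in Table 3; the function names are hypothetical. Because the table reports rounded estimates and standard errors, the recomputed t-values differ slightly from the published ones.

```python
def t_value(estimate, std_error):
    """t-value of a fixed effect: the estimate divided by its standard error."""
    return estimate / std_error


def is_significant(t, cutoff=2.0):
    """Rule of thumb from the text: |t| greater than 2."""
    return abs(t) > cutoff


# Rounded estimates and standard errors copied from Table 3 (young adults).
table3 = {
    "Allomorphs": (-0.034, 0.015),
    "PC1": (-0.021, 0.004),
    "PC4": (-0.042, 0.008),
}

results = {}
for name, (estimate, std_error) in table3.items():
    t = t_value(estimate, std_error)
    results[name] = (t, is_significant(t))
    print(f"{name}: t = {t:.2f}, significant = {is_significant(t)}")
```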
Table 4, Table 5, and Table 6 show the results of the mixed-effects analysis for elderly controls (17 participants), individuals with MCI (24 participants), and individuals with AD (21 participants).
One interesting difference between the three elderly groups emerged. Education significantly predicted the speed of word recognition in elderly controls (Table 4) and in individuals with MCI (Table 5): the estimate for Education is negative, which means that more years of education were associated with faster reaction times. Education was not a significant predictor in individuals with AD and was therefore dropped from that model (Table 6). This difference emerged although there was no obvious difference in the variability of years of education among these groups (AD: mean 10.8 years, SD 4.2, range 5-19; MCI: mean 10.4 years, SD 3.5, range 6-17; elderly controls: mean 13.7 years, SD 3.7, range 8-20).
Fixed effects | Estimate | Std.Error | t-value | p-value |
(Intercept) | -0.72 | 0.157 | -4.574 | <0.001 |
Allomorphs | -0.022 | 0.01 | -2.14 | 0.035 |
PC1 | -0.011 | 0.003 | -4.122 | <0.001 |
PC2 | -0.011 | 0.005 | -2.223 | 0.029 |
PC4 | -0.02 | 0.006 | -3.687 | <0.001 |
Education | -0.024 | 0.011 | -2.237 | 0.041 |
Random effects | | | | |
Groups | Name | Variance | Std.Dev. | |
Item | (Intercept) | 0.003 | 0.057 | |
Subject | (Intercept) | 0.026 | 0.16 | |
Residual | | 0.033 | 0.181 | |
Number of obs. 1595; Item, 99; Subject, 17 |
Table 4. Estimated coefficients, standard errors, and t- and p-values for the mixed models fitted to the response latencies elicited for real words for elderly controls. This table has been modified with permission from Cortex3.
Fixed effects | Estimate | Std.Error | t-value | p-value |
(Intercept) | -0.562 | 0.114 | -4.922 | <0.001 |
PC1 | -0.009 | 0.003 | -3.218 | 0.002 |
PC2 | -0.013 | 0.005 | -2.643 | 0.01 |
PC4 | -0.018 | 0.006 | -3.078 | 0.003 |
Education | -0.039 | 0.01 | -3.708 | 0.001 |
Random effects | | | | |
Groups | Name | Variance | Std.Dev. | |
Item | (Intercept) | 0.003 | 0.056 | |
Subject | (Intercept) | 0.03 | 0.174 | |
Residual | | 0.061 | 0.248 | |
Number of obs. 2227; Item, 99; Subject, 24 |
Table 5. Estimated coefficients, standard errors, and t- and p-values for the mixed models fitted to the response latencies elicited for real words for individuals with MCI. This table has been modified with permission from Cortex3.
Fixed effects | Estimate | Std.Error | t-value | p-value |
(Intercept) | -0.876 | 0.051 | -17.017 | <0.001 |
Allomorphs | -0.018 | 0.009 | -2.008 | 0.048 |
PC1 | -0.011 | 0.003 | -4.097 | <0.001 |
PC2 | -0.011 | 0.004 | -2.718 | 0.008 |
PC4 | -0.018 | 0.005 | -3.751 | <0.001 |
Random effects | | | | |
Groups | Name | Variance | Std.Dev. | Corr |
Trial | (Intercept) | 0.001 | 0.034 | |
Item | (Intercept) | 0.002 | 0.049 | |
Subject | (Intercept) | 0.045 | 0.212 | |
| PC1 | 4.138e-05 | 0.006 | 0.83 |
Residual | | 0.026 | 0.162 | |
Number of obs. 1879; Item, 99; Subject, 21 |
Table 6. Estimated coefficients, standard errors, and t- and p-values for the mixed models fitted to the response latencies elicited for real words for individuals with AD. This table has been modified with permission from Cortex3.
The study reported here addressed an additional question: whether the number of stem allomorphs associated with a word influences the speed of word recognition42,43. Stem allomorphs are different forms of a word stem across various linguistic contexts. For example, in English, foot has two stem allomorphs, foot and feet. In other words, the word stem changes depending on whether it is in the singular or plural form. The study described here tested speakers of Finnish, a language with considerably more complex stem changes than English. Words with greater stem allomorphy (i.e., words with more changes to their stems) elicited faster reaction times in all groups except the MCI group. In Table 3, Table 4, and Table 6, the estimates for the number of allomorphs were negative, which means that the more allomorphs a word had, the faster the reaction times it elicited. In the MCI group (Table 5), the number of allomorphs was not a significant predictor and hence was dropped from the model.
By using a simple language task that does not require language production, the present study investigated the impact of various lexical variables on word recognition in neurologically healthy younger and older adults, as well as in people with Alzheimer’s disease or Mild Cognitive Impairment. The age range used for recruiting “older adults” might depend on the specific research interests; however, the range for the healthy elderly group should match as closely as possible the age range and distribution for individuals with MCI or AD recruited for the same study.
To avoid collinearity between predictors, the lexical variables were orthogonalized into principal components and added to the mixed-effects models, where reaction times served as the dependent variable. The combination of a simple lexical decision experiment and a mixed-effects regression analysis led to the novel finding that the language difficulties for patients with AD may be attributed not only to changes to the semantic system but also to an increased reliance on word form. Interestingly, a similar pattern was found for people with Mild Cognitive Impairment and cognitively healthy elderly. This suggests that an increased reliance on form-based aspects of language processing might be part of a common age-related change in written word recognition.
In a factorial design, researchers traditionally create two or more sets of words that differ according to the variable of interest and then match these sets of words on a number of other lexical characteristics that may influence processing speed. The assumption is that any behavioral difference obtained between these two sets of words should be attributed to the manipulated (i.e., unmatched) variable. One problem with this type of design is that it is very difficult to match sets of words on more than a few variables. Another problem is that there might be some potentially significant variables that the word sets were not matched on or could not be matched on for a variety of reasons. Also, the factorial design treats continuous phenomena as if they were dichotomous factors. The use of mixed-effects models for statistical analysis of the behavioral data permits the researcher to include potentially important lexical variables as explanatory variables without the need to match words or lists of words according to these variables. In a mixed-effects model, the variables Subject (participant code/number), Item (experimental stimuli), and Trial (trial number) are added as random effects. Random intercepts are included because it is assumed that subjects vary in their overall reaction times (i.e., some participants are naturally slower or faster across the board).
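The intuition behind a by-subject random intercept can be illustrated with a deliberately crude stand-in: each participant's mean reaction time expressed as an offset from the grand mean. The Python sketch below is not a mixed-effects fit. A mixed-effects model estimates such offsets jointly with the fixed effects and shrinks them toward zero, which this code does not do. The reaction times and subject codes are invented for the example.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds; NOT data from the study.
rts = {
    "S1": [520, 540, 560],
    "S2": [700, 720, 740],
    "S3": [610, 600, 620],
}

# Grand mean over all trials of all subjects.
grand_mean = mean(rt for trials in rts.values() for rt in trials)

# Per-subject offset: how much slower (+) or faster (-) a subject is overall.
offsets = {subject: mean(trials) - grand_mean for subject, trials in rts.items()}

for subject, offset in offsets.items():
    print(f"{subject}: {offset:+.1f} ms relative to the grand mean")
```

Here S2 is slower than average on every trial and S1 is faster. Treating Subject as a random effect lets the model absorb this kind of across-the-board difference rather than attributing it to the lexical predictors.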
This methodology can be applied to other types of questions and to other populations, e.g., multilinguals or individuals with aphasia. For the former group, language processing may differ from that of monolinguals, so language background should be considered when recruiting a mixed-language population, either by restricting recruitment to only one type of group or by comparing results later to determine whether language background influenced the results.
The authors have nothing to disclose.
We thank Minna Lehtonen, Tuomo Hänninen, Merja Hallikainen, and Hilkka Soininen for their contribution to the data collection and processing reported here. The data collection was supported by VPH Dementia Research enabled by EU, Grant agreement No. 601055.
E-Prime | Psychology Software Tools | version 2.0.10.356. | |
PC with Windows and Keyboard | |||
R | R Foundation for Statistical Computing | R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. |