BM-PROMA is a valid and reliable multimedia diagnostic tool that can provide a complete cognitive profile of children with mathematical learning disabilities.
Learning mathematics is a complex process that requires the development of multiple domain-general and domain-specific skills. It is therefore not unexpected that many children struggle to stay at grade level, and this becomes especially difficult when several abilities from both domains are impaired, as in the case of mathematical learning disabilities (MLD). Surprisingly, although MLD is one of the most common neurodevelopmental disorders affecting schoolchildren, most of the diagnostic instruments available do not include assessment of domain-general and domain-specific skills. Furthermore, very few are computerized. To the best of our knowledge, there is no tool with these features for Spanish-speaking children. The purpose of this study was to describe the protocol for the diagnosis of Spanish MLD children using the BM-PROMA multimedia battery. BM-PROMA facilitates the evaluation of both skill domains, and the 12 tasks included for this purpose are empirically evidence-based. The strong internal consistency of BM-PROMA and its multidimensional internal structure are demonstrated. BM-PROMA proves to be an appropriate tool for diagnosing children with MLD during primary education. It provides a broad cognitive profile for the child, which will be relevant not only for diagnosis but also for individualized instructional planning.
One of the crucial objectives of primary education is the acquisition of mathematical skills. This knowledge is highly relevant, as we all use mathematics in our everyday lives, for example, to calculate change given at the supermarket1,2. As such, the consequences of poor mathematical performance go beyond the academic. At the social level, a high prevalence of poor mathematical performance within the population constitutes a cost to society. There is evidence that improvement of poor numerical skills in the population leads to significant savings for a country3. There are also negative consequences at an individual level. For example, those who show a low level of mathematical skills present poor professional development (e.g., higher rates of employment in poorly paid manual occupations and higher unemployment)4,5,6, frequently report negative socio-emotional responses towards academics (e.g., anxiety, low motivation towards academics)7,8, and tend to present poorer mental and physical health than their peers with average mathematical achievement9. Students with mathematical learning disabilities (MLD) show very poor performance that persists over time10,11,12. As such, they are more likely to suffer the consequences mentioned above, especially if they are not promptly diagnosed13.
MLD is a neurobiological disorder characterized by severe impairment in terms of learning basic numerical skills despite adequate intellectual capacity and schooling14. Although this definition is widely accepted, the instruments and criteria for its identification are still under discussion15. An excellent illustration of the absence of a universal agreement regarding MLD diagnosis is the variety of reported prevalence rates, ranging from 3 to 10%16,17,18,19,20,21. This difficulty in diagnosis stems from the complexity of mathematical knowledge, which requires that a combination of multiple domain-general and domain-specific skills be learned22,23. Children with MLD show very different cognitive profiles, with a broad constellation of deficits14,24,25,26,27. In this regard, multidimensional assessment has been recommended, using tasks that involve different numerical representations (i.e., verbal, Arabic, analogic) and arithmetical skills11.
In primary school, symptoms of MLD are diverse. In terms of domain-specific skills, it is consistently found that many MLD students show difficulties in basic numerical skills, such as quickly and accurately recognizing Arabic numerals28,29,30, comparing magnitudes31,32, or representing numbers on the number line33,34. Primary school children have also shown difficulty in understanding conceptual knowledge, such as place value35, arithmetic knowledge36, or ordinality measured through ordered sequences37. Regarding domain-general skills, particular focus has been put on the role of working memory38,39 and language40 in the development of mathematical skills in children with and without MLD. In relation to working memory, the results suggest that students with MLD show a deficit in the central executive, especially when required to manipulate numerical information41,42. A deficit in visuospatial short-term memory has also been frequently reported in children with MLD43,44. Language skills have been found to be a prerequisite for learning numeracy skills, especially those that involve high verbal processing demand7. For example, phonological processing skills [e.g., phonological awareness and Rapid Automatized Naming (RAN)] are closely linked to those basic skills learned in primary school, such as numerical processing or arithmetic calculation39,45,46,47. Here, it has been demonstrated that variations in phonological awareness and RAN are associated with individual differences in numeracy skills that involve managing verbal code42,48. In light of the complex profile of children with MLD, a diagnostic tool should ideally include tasks that assess both domain-general and domain-specific skills, which are reported as being more frequently deficient in these children.
In recent years, several paper-and-pencil screening tools for MLD have been developed. Those most commonly used with Spanish primary school children are a) Evamat-Batería para la Evaluación de la Competencia Matemática (Battery for the Evaluation of Mathematical Competency)49; b) Tedi-Math: A Test for Diagnostic Assessment of Mathematical Disabilities (Spanish adaptation)50; c) Test de Evaluación Matemática Temprana de Utrecht (TEMT-U)51,52, the Spanish version of the Utrecht Early Numeracy Test53; and d) Test of early math abilities (TEMA-3)54. These instruments measure many of the domain-specific skills mentioned above; however, none of them assess domain-general skills. Another limitation of these instruments – and of paper-and-pencil tools in general – is that they cannot provide information regarding the accuracy and automaticity with which each item is processed. This would only be possible with a computerized battery. However, very few applications have been developed for dyscalculia diagnosis. The first computerized tool designed to identify children (aged 6 to 14) with MLD was the Dyscalculia Screener55. A few years later, the web-based DyscalculiUm56 was developed with the same purpose but focused on adults and learners in post-16 education. Although still limited, there has been growing interest in computerized tool design for the diagnosis of MLD in recent years57,58,59,60. None of the tools mentioned have been standardized for Spanish children, and only one of them – the MathPro Test57 – includes domain-general skill evaluation. Given the importance of identifying children with low mathematical achievement, especially those with MLD, and in the absence of computerized instruments for the Spanish population, we present a multimedia evaluation protocol that includes both domain-general and domain-specific skills.
This protocol was conducted in accordance with the guidelines provided by the Comité de Ética de la Investigación y Bienestar Animal (Research Ethics and Animal Welfare Committee, CEIBA), Universidad de La Laguna.
NOTE: The Batería multimedia para la evaluación de habilidades cognitivas y básicas en matemáticas [Multimedia Battery for Assessment of Cognitive and Basic Skills in Mathematics (BM-PROMA)]61 was developed using Unity 2.0 Professional Edition and the SQLITE Database Engine. BM-PROMA includes 12 subtests: 8 to assess domain-specific skills and 4 to evaluate domain-general processes. For each subtest, instructions are given orally by an animated humanoid robot, and the testing phase is preceded by a demonstration and two training trials. The application protocol for each task is presented below with an example.
1. Experimental setup
2. Domain-specific subtests
3. Domain-general subtests
In order to test the utility and effectiveness of this diagnostic tool, its psychometric properties were analyzed in a large-scale sample. A total of 933 Spanish primary school students (boys = 508, girls = 425; Mage = 10 years, SD = 1.36) from grade 2 to grade 6 (grade 2, N = 169 [89 boys]; grade 3, N = 170 [89 boys]; grade 4, N = 187 [106 boys]; grade 5, N = 203 [113 boys]; grade 6, N = 204 [110 boys]) participated in the study. The children were from intact classes at state and private schools in urban and suburban areas of Santa Cruz de Tenerife. Students were classified into two groups: a) MLD children with scores within or below the 16th percentile in a standardized arithmetic test (grade 2, N = 14; grade 3, N = 35; grade 4, N = 11; grade 5, N = 47; grade 6, N = 42); and b) typically achieving children with scores within or above the 40th percentile in the same test (grade 2, N = 130; grade 3, N = 124; grade 4, N = 149; grade 5, N = 110; grade 6, N = 105).
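The grouping rule above can be written as a small function. This is only an illustrative sketch: the function and constant names are hypothetical, and it assumes the percentile from the standardized arithmetic test is already available as a number. Children scoring between the two cut-offs are left unassigned, which is consistent with the group sizes above summing to 767 of the 933 participants.

```python
from typing import Optional

# Cut-offs taken from the text; the names are hypothetical.
MLD_CUTOFF = 16      # at or below the 16th percentile -> MLD group
TYPICAL_CUTOFF = 40  # at or above the 40th percentile -> typically achieving


def classify_group(percentile: float) -> Optional[str]:
    """Return 'MLD', 'typical', or None for a score between the cut-offs."""
    if percentile <= MLD_CUTOFF:
        return "MLD"
    if percentile >= TYPICAL_CUTOFF:
        return "typical"
    return None  # between the cut-offs: assigned to neither group


# Example with made-up percentiles.
groups = [classify_group(p) for p in (5, 16, 25, 40, 90)]
```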
The multidimensionality of the tool's structure was tested by means of Confirmatory Factor Analysis (CFA) using the lavaan package in R68. A five-factor model for BM-PROMA was hypothesized. A cognitive factor containing all domain-general tasks was expected, as the contribution of domain-general skills to mathematical performance is different from that of domain-specific skills69,70. An arithmetic factor grouping only arithmetic tasks was also expected, as arithmetic and basic numerical skills involve different cognitive and brain correlates71. Finally, following the Triple Code Model72, three factors grouping numerical tasks according to whether the task involves verbal, Arabic or analogic representations were expected.
Evidence concerning internal consistency was assessed using Cronbach's alpha. Cronbach's alphas were calculated for all measures and presented both for each grade and for the whole participant sample. Internal consistency values were considered excellent when α ≥ .80, good when α ≥ .70 and <.80, acceptable when α ≥ .60 and <.70, poor when α ≥ .50 and < .60, and unacceptable when α < .5073.
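As an illustration of the internal consistency criterion, the sketch below computes Cronbach's alpha from an item-by-child score matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores), and maps the result onto the bands listed above. The function names and the toy data are assumptions for illustration; this is not the software used in the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one row per child, each row a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across children.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the children's total scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def consistency_level(alpha):
    """Map alpha onto the bands given in the text."""
    if alpha >= 0.80:
        return "excellent"
    if alpha >= 0.70:
        return "good"
    if alpha >= 0.60:
        return "acceptable"
    if alpha >= 0.50:
        return "poor"
    return "unacceptable"


# Toy data (made up): two perfectly consistent items answered by three children.
toy = [[1, 1], [2, 2], [3, 3]]
toy_alpha = cronbach_alpha(toy)
```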
The model was estimated using the robust maximum likelihood (RML) method, and goodness of fit was assessed using the following indexes74,75: standardized root mean square residual (SRMR ≤ .08), chi-square (χ2, p > .05), Tucker-Lewis index (TLI ≥ .90), comparative fit index (CFI ≥ .90), root mean square error of approximation (RMSEA ≤ .06), and Composite Reliability (ω ≥ .60). Modification indices (MI) were inspected.
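The cut-offs above lend themselves to a simple automated check. The sketch below is a hypothetical helper, not part of lavaan or of the authors' analysis scripts; it uses the abbreviations SRMR and RMSEA as they appear with the reported results, and the example values are the fit indices reported for the five-factor model in the results.

```python
# Cut-offs from the text: "max" means the index must not exceed the value,
# "min" means it must reach at least the value.
THRESHOLDS = {
    "srmr": ("max", 0.08),
    "tli": ("min", 0.90),
    "cfi": ("min", 0.90),
    "rmsea": ("max", 0.06),
}


def check_fit(indices):
    """Return a dict telling, for each index, whether it meets its cut-off."""
    results = {}
    for name, (kind, cutoff) in THRESHOLDS.items():
        value = indices[name]
        results[name] = value <= cutoff if kind == "max" else value >= cutoff
    return results


# Values reported for the five-factor model.
reported = {"cfi": 0.948, "tli": 0.930, "rmsea": 0.053, "srmr": 0.046}
fit = check_fit(reported)
```

The chi-square criterion is left out of the helper on purpose, since it is a significance test rather than a descriptive index.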
Descriptive statistics were examined and are presented in Table 1. Results showed a normal distribution of the data, with kurtosis and skewness indexes lower than 10.00 and 3.00, respectively76.
Measures | Grade 2 M | Grade 2 SD | Grade 3 M | Grade 3 SD | Grade 4 M | Grade 4 SD | Grade 5 M | Grade 5 SD | Grade 6 M | Grade 6 SD | Total M | Total SD |
Missing numbers | 3.81 | 3.29 | 5.79 | 3.49 | 7.68 | 3.15 | 7.56 | 3.50 | 8.33 | 2.98 | 6.67 | 3.65 | |
Two-digit number comparison | 2.02 | .54 | 1.78 | .35 | 1.50 | .24 | 1.46 | .15 | 1.44 | .15 | 1.62 | .38 | |
Reading numbers | 1.14 | .27 | 1.27 | .23 | 1.14 | .21 | 1.17 | .18 | 1.21 | .20 | 1.24 | .24 | |
Place value | 8.83 | 3.19 | 9.83 | 2.89 | 10.58 | 1.62 | 10.33 | 1.95 | 10.89 | 1.49 | 10.14 | 2.38 | |
Number line 0-100 | .11 | .06 | .07 | .30 | .06 | .02 | .05 | .02 | .05 | .19 | .07 | .04 | |
Number line 0-1000 | .18 | .09 | .13 | .06 | .09 | .04 | .09 | .04 | .07 | .02 | .11 | .06 | |
Addition fact retrieval | 5.11 | 4.42 | 7.03 | 5.24 | 11.15 | 5.74 | 10.27 | 5.82 | 12.03 | 5.30 | 9.32 | 5.93 | |
Subtraction fact retrieval | 4.36 | 3.79 | 5.78 | 4.66 | 8.94 | 4.53 | 8.64 | 4.84 | 9.76 | 4.31 | 7.66 | 4.89 | |
Multiplication fact retrieval | 2.92 | 3.27 | 6.32 | 4.97 | 11.48 | 5.67 | 10.10 | 5.90 | 11.49 | 5.43 | 8.72 | 6.13 | |
Arithmetic principles | 8.33 | 4.71 | 8.05 | 3.41 | 8.95 | 3.80 | 9.38 | 4.01 | 10.78 | 4.56 | 9.21 | 4.22 | |
Counting Span | 4.57 | 2.35 | 5.45 | 2.65 | 6.41 | 2.56 | 6.43 | 2.59 | 7.03 | 2.49 | 6.05 | 2.67 | |
Visuospatial working memory | 6.26 | 2.74 | 7.30 | 2.62 | 8.18 | 2.33 | 8.46 | 2.42 | 9.27 | 2.23 | 7.98 | 2.66 | |
Phoneme deletion | 9.34 | 4.78 | 10.96 | 4.60 | 12.64 | 2.83 | 12.62 | 2.92 | 12.61 | 3.37 | 11.73 | 3.94 | |
Rapid automatized naming- Letters | 1.37 | .32 | 1.53 | .31 | 1.72 | .31 | 1.80 | .35 | 1.87 | .36 | 1.68 | .38 |
Table 1: Descriptive statistics of BM-PROMA subtests per grade.
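The normality screen described before Table 1 (skewness below 3.00, kurtosis below 10.00) can be sketched as follows. The moment-based estimators used here are an assumption, since the text does not say which estimator or software produced the indexes, and the sample scores are made up.

```python
def skewness(xs):
    """Moment-based skewness: m3 / m2^1.5 (population moments)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3


def within_limits(xs, max_skew=3.0, max_kurt=10.0):
    """Apply the cut-offs quoted in the text to absolute values."""
    return abs(skewness(xs)) < max_skew and abs(excess_kurtosis(xs)) < max_kurt


# Made-up, symmetric scores for illustration.
sample = [1, 2, 3, 4, 5]
ok = within_limits(sample)
```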
The internal consistency of each measure, except for numerical working memory, is presented in Table 2. Results indicated α values above .70 for the majority of the measures at each grade, suggesting good to excellent internal consistency for most of the tasks.
Measures | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Total | ICL | |
Missing numbers | .841 | .843 | .807 | .858 | .801 | .861 | 1 | |
Two-digit number comparison | .891 | .925 | .916 | .868 | .866 | .895 | 1 | |
Reading numbers | .861 | .830 | .849 | .892 | .753 | .855 | 1-2 | |
Place value | .843 | .864 | .722 | .686 | .740 | .809 | 1-3 | |
Number line 0-100 | .825 | .748 | .658 | .547 | .678 | .801 | 1-4 | |
Number line 0-1000 | .806 | .820 | .763 | .743 | .729 | .867 | 1-2 | |
Addition fact retrieval | .852 | .879 | .885 | .892 | .856 | .898 | 1 | |
Subtraction fact retrieval | .826 | .880 | .846 | .868 | .823 | .876 | 1 | |
Multiplication fact retrieval | .811 | .861 | .867 | .881 | .853 | .901 | 1 | |
Arithmetic principles | .586 | .734 | .844 | .742 | .866 | .821 | 1-4 | |
Visuospatial working memory | .741 | .726 | .660 | .695 | .699 | .747 | 1-3 | |
Phoneme deletion | .918 | .933 | .835 | .853 | .899 | .911 | 1 | |
Note. ICL = internal consistency level; 1 = excellent; 2 = good; 3 = acceptable; 4 = poor; 5 = unacceptable.
Table 2: Cronbach's α coefficient for all the measures at each grade.
In order to confirm the factorial structure of BM-PROMA, a CFA was conducted using the RML estimation method. The fit indices suggested an adequate fit of the five-factor model proposed for the data: χ2 = 29.930, df = 67, p < .001; CFI = .948; TLI = .930; RMSEA = .053, 90% CI = [.046-.061]; SRMR = .046; F1, ω = .50; F2, ω = .75; F3, ω = .80; F4, ω = .81; F5, ω = .46 (Figure 11).
Figure 11: Confirmatory Factor Analysis of the BM-PROMA. Note. F1 = Arabic numerical representation factor; F2 = analogical representation factor; F3 = verbal representation factor; F4 = arithmetic factor; F5 = cognitive factor; RAN-L = rapid automatized naming - letters; VWM = visuospatial working memory; CS = counting span; PD = phoneme deletion; AP = arithmetic principles; MFR = multiplication fact retrieval; AFR = addition fact retrieval; SFR = subtraction fact retrieval; TNC = two-digit number comparison; RN = reading numbers; NL-100 = number line 0-100; NL-1000 = number line 0-1000; PV = place value; MN = missing number.
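For readers unfamiliar with composite reliability, the sketch below shows one common way to obtain ω for a factor from its standardized loadings, assuming uncorrelated residuals: ω = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The loadings in the example are hypothetical, because individual loadings are not listed in the text, and the study may have computed ω differently.

```python
def composite_reliability(loadings):
    """Omega from standardized loadings, assuming uncorrelated residuals."""
    sum_loadings_sq = sum(loadings) ** 2
    # With standardized loadings, each residual variance is 1 - loading^2.
    residual_sum = sum(1 - l ** 2 for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + residual_sum)


def meets_criterion(omega, cutoff=0.60):
    """Apply the omega >= .60 criterion quoted for the analysis."""
    return omega >= cutoff


# Hypothetical two-indicator factor with loadings of .70 each.
omega = composite_reliability([0.7, 0.7])
```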
The multidimensional approach of the tool was confirmed. The tasks included in BM-PROMA loaded on five factors: 1) the missing number and place value tasks loaded on the "Arabic Numerical Representation Factor"; 2) the number line estimation 0-100 and number line estimation 0-1000 tasks loaded on the "Analogical Representation Factor"; 3) the two-digit number comparison and reading number tasks loaded on the "Verbal Representation Factor"; 4) the arithmetic principles, addition fact retrieval, multiplication fact retrieval, and subtraction fact retrieval tasks loaded on the "Arithmetic Factor"; and 5) the counting span, phoneme deletion, RAN-L, and visuospatial working memory tasks loaded on the "Cognitive Factor".
In order to examine measurement invariance across grades, we split the sample into two groups. The first group was composed of students from grades 2-3 (Group A). The second group was composed of students from grades 4-6 (Group B). Students were regrouped to increase sample size and minimize the number of groups, as sample characteristics, the number of groups compared, and model complexity all affect measurement invariance77. Four nested models were compared: configural (equivalence of model form), metric (equivalence of factor loading), scalar (equivalence of item intercept), and strict (equivalence of item residual). Results are presented in Table 3, which shows configural, metric, scalar, and strict invariance across groups.
Model | χ2 | df | CFI | TLI | RMSEA (90% CI) | SRMR | ΔCFI | ΔRMSEA |
Configural (structure) | 364.145 | 134 | .940 | .918 | .061 [.053 – .068] | .051 | | |
Metric (loadings) | 383.400 | 143 | .937 | .920 | .060 [.053 – .067] | .056 | -.003 | -.001 |
Scalar (intercepts) | 383.845 | 152 | .939 | .927 | .057 [.050 – .064] | .056 | .002 | -.003 |
Strict (residuals) | 398.514 | 166 | .939 | .933 | .055 [.048 – .062] | .056 | .000 | -.002 |
Note. CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval; SRMR = standardized root mean square residual; Δ = difference. All χ2 values are significant at p < 0.001.
Table 3: Fit indices for measurement invariance of BM-PROMA.
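The Δ columns in Table 3 are differences between each model and the preceding, less constrained one. The sketch below recomputes them from the CFI and RMSEA values in the table. The decision rule it applies (a CFI drop of no more than .010 and an RMSEA rise of no more than .015) is a commonly used convention and is an assumption here, since the text does not state which criterion was used.

```python
# (model, CFI, RMSEA) copied from Table 3, in order of increasing constraint.
MODELS = [
    ("configural", 0.940, 0.061),
    ("metric", 0.937, 0.060),
    ("scalar", 0.939, 0.057),
    ("strict", 0.939, 0.055),
]

# Assumed conventional limits; not stated in the text.
MAX_CFI_DROP = 0.010
MAX_RMSEA_RISE = 0.015


def compare_nested(models):
    """Compare each model with the preceding one in the list."""
    rows = []
    for (_, cfi0, rmsea0), (name, cfi1, rmsea1) in zip(models, models[1:]):
        d_cfi = round(cfi1 - cfi0, 3)
        d_rmsea = round(rmsea1 - rmsea0, 3)
        holds = d_cfi >= -MAX_CFI_DROP and d_rmsea <= MAX_RMSEA_RISE
        rows.append((name, d_cfi, d_rmsea, holds))
    return rows


comparison = compare_nested(MODELS)
```

Under these assumed limits, every step from configural to strict is retained, which agrees with the conclusion drawn from Table 3.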
Finally, Receiver Operating Characteristic (ROC) analysis was performed to study the diagnostic accuracy of BM-PROMA based on the five factors derived from the CFA. The standardized Prueba de Cálculo Numérico (Arithmetic Computation Test)78 was used as the gold standard for testing the accuracy of each single diagnostic measure (i.e., factors). Area Under the ROC Curve (AUC > .70), sensitivity (> .70) and specificity (> .80) values were explored79. Results revealed acceptable AUCs for all factors in all grades except for F3 (i.e., verbal representation factor) in grades 3, 5 and 6, and F2 (i.e., analogical representation factor) in grade 4 (Table 4). Sensitivity and specificity values were highly variable, ranging from .468 to .855 for sensitivity and from .595 to .929 for specificity. These results indicate that although all measures contribute to the development of mathematical competency, their utility varies across grades.
Grade | Factor | AUC | Sn | Sp |
Grade 2 | F1 | .912 | .808 | .857 |
Grade 2 | F2 | .902 | .785 | .929 |
Grade 2 | F3 | .746 | .823 | .786 |
Grade 2 | F4 | .906 | .846 | .929 |
Grade 2 | F5 | .918 | .838 | .929 |
Grade 3 | F1 | .762 | .734 | .714 |
Grade 3 | F2 | .736 | .645 | .800 |
Grade 3 | F3 | .608 | .468 | .743 |
Grade 3 | F4 | .753 | .605 | .771 |
Grade 3 | F5 | .733 | .556 | .743 |
Grade 4 | F1 | .719 | .745 | .727 |
Grade 4 | F2 | .694 | .597 | .727 |
Grade 4 | F3 | .817 | .705 | .818 |
Grade 4 | F4 | .775 | .691 | .818 |
Grade 4 | F5 | .782 | .678 | .727 |
Grade 5 | F1 | .855 | .764 | .809 |
Grade 5 | F2 | .810 | .736 | .745 |
Grade 5 | F3 | .630 | .527 | .681 |
Grade 5 | F4 | .835 | .745 | .809 |
Grade 5 | F5 | .832 | .855 | .787 |
Grade 6 | F1 | .839 | .686 | .714 |
Grade 6 | F2 | .776 | .648 | .738 |
Grade 6 | F3 | .524 | .486 | .595 |
Grade 6 | F4 | .891 | .848 | .905 |
Grade 6 | F5 | .817 | .752 | .738 |
Table 4: Diagnostic accuracy of BM-PROMA factors per grade. Note. F1 = Arabic numerical representation factor; F2 = analogical representation factor; F3 = verbal representation factor; F4 = arithmetic factor; F5 = cognitive factor; AUC = area under the curve; Sn = sensitivity; Sp = specificity.
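To make the three quantities in Table 4 concrete, the sketch below computes sensitivity and specificity at a given cut-off, and the AUC as the proportion of MLD/typical pairs in which the child with MLD has the lower score (ties count as half). It assumes that lower factor scores indicate greater risk, which would need to be reversed for time-based measures; the scores, labels, and function names are made up for illustration and do not reproduce the study's analysis.

```python
def sensitivity_specificity(scores, has_mld, cutoff):
    """Flag children scoring at or below `cutoff`; return (Sn, Sp)."""
    pairs = list(zip(scores, has_mld))
    tp = sum(1 for s, y in pairs if y and s <= cutoff)
    fn = sum(1 for s, y in pairs if y and s > cutoff)
    tn = sum(1 for s, y in pairs if not y and s > cutoff)
    fp = sum(1 for s, y in pairs if not y and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, has_mld):
    """Share of (MLD, typical) pairs where the MLD child scores lower."""
    mld = [s for s, y in zip(scores, has_mld) if y]
    typical = [s for s, y in zip(scores, has_mld) if not y]
    wins = sum(
        1.0 if m < t else 0.5 if m == t else 0.0
        for m in mld
        for t in typical
    )
    return wins / (len(mld) * len(typical))


# Made-up factor scores and gold-standard labels for six children.
scores = [1, 2, 3, 4, 5, 6]
has_mld = [True, True, False, True, False, False]
sn, sp = sensitivity_specificity(scores, has_mld, cutoff=2)
area = auc(scores, has_mld)
```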
Figure 1: Missing number task
Figure 2: Two-digit number comparison task
Figure 3: Reading numbers task
Figure 4: Place value task
Figure 5: Number line estimation task
Figure 6: Arithmetic fact retrieval task
Figure 7: Arithmetic principles task
Figure 8: Counting span task
Figure 9: Rapid automatized naming – letter task (RAN-L)
Figure 10: Visuospatial working memory task
Children with MLD are at risk not only of academic failure but also of psycho-emotional and health disorders8,9 and, later on, of employment deprivation4,5. Thus, it is crucial to diagnose MLD promptly in order to provide the educational support that these children need. However, diagnosing MLD is complex due to the multiple domain-specific and domain-general skill deficits that underlie the disorder22,23. BM-PROMA is one of the few computerized tools that uses a multidimensional protocol to diagnose primary school children with MLD, and the first to be standardized for Spanish-speaking children.
The present study has proven that BM-PROMA is a valid and reliable instrument. Results from ROC analyses were promising, showing AUCs ranging from .72 to .92 across almost all factors and grades. This indicates acceptable to excellent discrimination79. The weakest support was found for F3 in grades 3, 5 and 6 and for F2 in grade 4, which yielded AUCs < .70. It is important to note that we used only one measure as the gold standard, and that it is focused on multi-digit calculation skills; as such, it is a very limited measure. A gold standard should reflect the content of the criterion measure under investigation80, so we consider that the classification accuracy could be improved by the addition of other standardized state assessments in future studies.
Although BM-PROMA is a very comprehensive tool, it would be relevant for future versions to include other domain-specific skills that have been found to be impaired in MLD children, for example, non-symbolic comparison tasks in younger children81 and rational number manipulation or the solving of arithmetic word problems82,83 in older children. It would also be essential to incorporate other domain-general skills that seem to be deficient in MLD, such as inhibitory control84.
Despite the limitations described, BM-PROMA is one of the few pieces of software designed to identify children with dyscalculia, and the present study has proven that it is a valid and reliable instrument. The internal structure represents the tool's multidimensional evaluation approach. It provides a broad cognitive profile for the child, which is relevant not only for diagnosis but also for individualized instructional planning. Furthermore, its multimedia format is highly motivating for the children and, at the same time, makes the assessment procedure easier.
The authors have nothing to disclose.
We gratefully acknowledge the support of the Spanish government through its Plan Nacional I+D+i (R+D+i National Research Plan, Spanish Ministry of Economy and Competitiveness), project ref: PET2008_0225, with the second author as principal investigator; and CONICYT-Chile [FONDECYT REGULAR Nº 1191589], with the first author as principal investigator. We also thank the Unidad de Audiovisuales ULL team for their participation in the production of the video.
Multimedia Battery for Assessment of Cognitive and Basic Skills in Maths | Universidad de La Laguna | Pending assignment | BM-PROMA |