Precise measurement of neurological and neuropsychological impairment and disability in multiple sclerosis is challenging. We report methodologic details on a new test, the Multiple Sclerosis Performance Test (MSPT). This new approach to the objective quantification of MS-related disability provides a computer-based platform for precise, valid measurement of MS severity.
Precise measurement of neurological and neuropsychological impairment and disability in multiple sclerosis is challenging. We report a new test, the Multiple Sclerosis Performance Test (MSPT), which represents a new approach to quantifying MS-related disability. The MSPT takes advantage of advances in computer technology, information technology, biomechanics, and clinical measurement science. The resulting MSPT represents a computer-based platform for precise, valid measurement of MS severity. Based on, but extending, the Multiple Sclerosis Functional Composite (MSFC), the MSPT provides precise, quantitative data on walking speed, balance, manual dexterity, visual function, and cognitive processing speed. The MSPT was tested by 51 MS patients and 49 healthy controls (HC). MSPT scores were highly reproducible, correlated strongly with technician-administered test scores, discriminated MS from HC and severe from mild MS, and correlated with patient reported outcomes. Measures of reliability, sensitivity, and clinical meaning for MSPT scores were favorable compared with technician-based testing. The MSPT is a potentially transformative approach for collecting MS disability outcome data for patient care and research. Because the testing is computer-based, test performance can be analyzed in traditional or novel ways and data can be directly entered into research or clinical databases. The MSPT could be widely disseminated to clinicians in practice settings who are not connected to clinical trial performance sites or who are practicing in rural settings, drastically improving access to clinical trials for clinicians and patients. The MSPT could be adapted to out-of-clinic settings, such as the patient’s home, thereby providing more meaningful real world data. The MSPT represents a new paradigm for neuroperformance testing.
This method could have the same transformative effect on clinical care and research in MS as standardized computer-adapted testing has had in the education field, with clear potential to accelerate progress in clinical care and research.
Multiple sclerosis (MS) is an inflammatory disease of the central nervous system (CNS) affecting young adults, particularly women. Foci of inflammation occur unpredictably and intermittently in optic nerves, brain, and spinal cord. Episodic symptoms, termed relapses, characterize the early relapsing remitting stage of MS (RRMS). During the RRMS disease stage, irreversible CNS tissue injury accumulates, manifest as progressive brain atrophy and neurological disability. Brain atrophy in MS begins early in the disease and proceeds 2-8x faster than in age- and gender-matched healthy controls1. Presumably because of brain reserve and other compensatory mechanisms, clinically significant neurological disability is generally delayed for years, typically for 10-20 years after symptom onset. During more advanced stages of MS, termed secondary progressive MS (SPMS), relapses occur less frequently or disappear entirely, but gradually worsening neurological disability ensues, and patients experience some combination of difficulty with walking, arm function, vision, or cognition.
Quantifying MS clinical disease activity and progression is challenging for a variety of reasons. First, clinical manifestations vary widely in different MS patients. Second, disease activity varies significantly over time in individual MS patients. Third, MS manifestations vary in early compared with late disease stages. Lastly, neurological and neuropsychological impairment and disability are inherently difficult to quantify. This topic has been reviewed periodically over the past 20 years2-4. One standard measure used in patient care and research is the number or frequency of relapses. The relapse rate has been used as the primary outcome measure for the vast majority of clinical trials for RRMS. Reduction in relapse rate has supported approval of 10 disease-modifying drugs across 6 drug classes. The number of relapses only weakly correlates with later clinically significant disability, however, and it has proven difficult to accurately quantify relapse severity or recovery from relapse. The standard clinical disability scale – Kurtzke’s Expanded Disability Status Scale (EDSS)5 – is a 20-point ordinal scale ranging from 0 (normal neurological exam) to 10 (death due to MS). From 0–4.0, EDSS is determined by the combination of scores on 7 functional systems. From 4.0–6.0, EDSS is determined by the ability to walk a distance. EDSS 6.0 is the need for unilateral walking assistance. EDSS 6.5 is the need for bilateral walking assistance. Nonambulatory patients are scored EDSS ≥7.0, with higher numbers reflecting increasing difficulty with mobility and ability to perform self-care. The EDSS has achieved world-wide acceptance by regulatory agencies as an acceptable disability measure for MS clinical trials, based partly on its long-standing use in the MS field and familiarity to neurologists, but there are a number of limitations2,6.
EDSS has been criticized as being non-linear, imprecise at the lower end of the scale, insensitive at the middle and upper ends, and too heavily dependent on ambulation.
Based on perceived shortcomings of the EDSS, an alternative approach to quantifying disability in MS patients was recommended in 1997 by a Task Force of the National Multiple Sclerosis Society (NMSS)7,8. This Task Force recommended a 3-part composite scale, the Multiple Sclerosis Functional Composite (MSFC), for MS clinical trials. As initially recommended, the MSFC consisted of a timed measure of walking (the 25 ft timed walk [WST]), a timed measure of arm function (the 9-hole peg test [9HPT]), and a measure of information processing speed (the 3 sec version of the Paced Auditory Serial Addition Test [PASAT-3])9,10. Each measure was normalized to a reference population to create a component z-score, and the individual z-scores were averaged to create a composite score representing the severity of the individual patient relative to the reference population. The MSFC has not been accepted by regulatory agencies as a primary disability outcome measure, in part because the clinical meaning of a z-score or z-score change has not been clear. Also, the MSFC has been criticized because it lacks a visual function measure and because the PASAT is poorly accepted by patients. In response to these perceived shortcomings, an expert group11, convened by the National MS Society, recommended two modifications to the MSFC: 1) inclusion of the Sloan Low Contrast Letter Acuity test12 and 2) replacement of the PASAT-3 with the oral version of the Symbol Digit Modalities Test (SDMT)13,14. This expert group also recommended that the revised MSFC become the primary disability outcome measure to replace the EDSS in future MS clinical trials11. An effort is currently underway to achieve regulatory agency acceptance of a new disability outcome measure, based on quantitative measures of neuroperformance15.
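The z-score composite described above can be illustrated with a minimal Python sketch. The reference means and standard deviations, the component names, and the sign convention (timed tests are sign-flipped so that a higher z-score always means better performance) are assumptions made for illustration only; they are not taken from this report or from published MSFC reference data.

```python
from statistics import mean

# Hypothetical reference-population values (illustrative only, not real norms).
# "higher_is_better" is an assumed sign convention: for timed tests a longer
# time is worse, so the z-score is negated to keep "higher = better".
REFERENCE = {
    "walk_sec": {"mean": 5.0, "sd": 1.0, "higher_is_better": False},
    "peg_sec": {"mean": 20.0, "sd": 4.0, "higher_is_better": False},
    "pasat_correct": {"mean": 50.0, "sd": 10.0, "higher_is_better": True},
}


def component_z(name, value):
    """Normalize one raw score against the reference population."""
    ref = REFERENCE[name]
    z = (value - ref["mean"]) / ref["sd"]
    return z if ref["higher_is_better"] else -z


def composite_z(scores):
    """Average the component z-scores into a single composite score."""
    return mean(component_z(name, value) for name, value in scores.items())


# Hypothetical patient: slower walk, slower peg test, fewer correct answers.
patient = {"walk_sec": 7.0, "peg_sec": 24.0, "pasat_correct": 45.0}
print(round(composite_z(patient), 2))
```

With these made-up numbers the components are -2.0, -1.0, and -0.5, giving a composite of about -1.17, i.e., roughly 1.2 reference standard deviations worse than the reference mean. The sketch also shows why the clinical meaning of such a score is not self-evident: it depends entirely on the chosen reference population.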
It is clear that new methods are needed to improve outcomes assessment in the MS field. This paper describes development of a novel clinical disability outcome assessment tool, the Multiple Sclerosis Performance Test (MSPT), which builds on the MSFC approach, but which also merges advances in computer and information technology, biomechanics, human performance testing, and distance health. The MSPT application uses the iPad as a data collection platform to assess balance, walking speed, manual dexterity, visual function, and cognition. The MSPT can be performed in a clinical setting, or by MS patients themselves in a home setting. Data can be transmitted from a distance and entered directly into a clinical or research database, potentially obviating the need for a clinic visit. This advantage is particularly important for individuals with disabling neurological conditions such as MS. Finally, because the MSPT is computer based, various analyses are feasible, unlike technician administered performance testing. This paper describes the design and initial application of the MSPT.
Development of the MSPT and initial application was approved by the Cleveland Clinic Institutional Review Board. MS patients and HCs signed approved Informed Consent Documents prior to MSPT testing.
1. General Aspects of Method Development
The tablet used for this protocol is the Apple iPad, a powerful computing device with high quality inertial sensors embedded within the device. These various sensors packaged in a compact, affordable device provide an ideal platform for multi-sensor test administration.
2. Design and Testing of the Multiple Sclerosis Performance Test (MSPT)
Prepare the MSPT on the iPad to include five performance modules as follows: 1) Walking Speed, designed to simulate the WST; 2) Balance Test; 3) Manual Dexterity Test (MDT) for upper extremity function, designed to simulate the 9HPT; 4) Processing Speed Test (PST), designed to simulate the SDMT; and 5) Low contrast letter acuity test (LCLA), designed to simulate standard Sloan LCLA charts12 (Table 1).
3. Validation
MSPT testing methodology is illustrated by Figures 1-4 and demonstrated in the video.
We tested 51 MS patients and 49 HC (Table 2). They were well-matched for age, gender, race, and years of education. Disease duration in the MS patients, defined as time from first MS symptom, was 12.1 (9.1) years; EDSS was 3.9 (1.8); 74.5% were using MS disease modifying drugs; 29.4% had progressive forms of MS; and 43% were employed fulltime.
Reproducibility was tested for all measures by having each research subject perform each test twice, both during a morning test session, and during a second test session in the afternoon following a 2-4 hr rest period. Test-retest reproducibility was analyzed by inspecting visual plots (Figure 5), and by generating Concordance Correlation Coefficients. The figure shows reproducibility data for the technician (panels labeled 1) and MSPT (panels labeled 2) testing for the dimensions of walking (Figure 5A), upper extremity dexterity (Figure 5B), vision (Figure 5C), and cognitive processing speed (Figure 5D). For all patients, correlation coefficients for the walking test were 0.982 for the technician, and 0.961 for the MSPT; for the manual dexterity dimension, correlation coefficients were 0.921 for the technician and 0.911 for the MSPT; for the vision dimension (2.5% contrast level) they were 0.905 for the technician, and 0.925 for the MSPT; and for the cognitive processing speed dimension, they were 0.853 for the technician and 0.867 for the MSPT. Reproducibility was similar for MS and HCs.
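For readers who want to reproduce this kind of test-retest analysis, the following is a minimal Python sketch of a concordance correlation coefficient. It assumes the usual definition by Lin (with 1/n variances); the report does not state which variant was used, and the function name and example values are hypothetical.

```python
from statistics import fmean


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumed definition: 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2),
    using population (1/n) variances and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical test and retest scores, not study data.
test_scores = [1.0, 2.0, 3.0]
print(concordance_ccc(test_scores, [1.0, 2.0, 3.0]))  # identical retest
print(concordance_ccc(test_scores, [2.0, 3.0, 4.0]))  # retest shifted by +1
```

Unlike a Pearson coefficient, which would equal 1 in both cases, the concordance coefficient drops to 4/7 (about 0.57) when the retest is systematically shifted, which is why it is a natural choice for assessing agreement between repeated measurements.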
Concurrent validity was tested by comparing the technician and iPad based testing for each of the 4 dimensions using Pearson Correlation Coefficients. Data in Table 3 show strong correlations. Correlation coefficients exceeded 0.8 for all tests, and in many cases, correlation coefficients exceeded 0.9. Correlations were strong both for the morning and afternoon test sessions, and for both MS and HC.
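As an illustration of the concurrent validity comparison, a Pearson correlation coefficient between paired technician and tablet scores could be computed as in the sketch below. The function name and the example values are hypothetical and are not study data.

```python
from math import sqrt
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation coefficient for two paired lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical walking times (sec) for the same five subjects.
technician_sec = [4.8, 5.6, 7.2, 9.1, 12.4]
tablet_sec = [5.0, 5.9, 7.0, 9.6, 12.0]
print(pearson_r(technician_sec, tablet_sec))
```

A coefficient near 1 indicates that the two methods rank and scale subjects similarly; it does not by itself show that the two methods return the same absolute values.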
Table 4 shows the ability of each test to distinguish MS from HC. Data from the morning and afternoon test sessions are shown. All tests distinguished between the two groups, although the MSPT vision testing was borderline significant. Sensitivity in distinguishing MS from HC was quantified using Cohen’s d as the measure of effect size. Effect sizes of 0.8 or higher are considered strong effects. All tests showed good ability to distinguish MS from HC, and the tablet testing generally compared favorably to the technician testing.
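Cohen's d can be sketched in a few lines of Python. The version below uses the pooled standard deviation of the two groups, which is a common convention but an assumption here, since the report does not state which variant was used; the group values are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least 2 values")
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (fmean(group_a) - fmean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for two small groups, not study data.
ms_scores = [2.0, 4.0, 6.0]
hc_scores = [1.0, 3.0, 5.0]
print(cohens_d(ms_scores, hc_scores))
```

In this toy example the group means differ by 1 and the pooled standard deviation is 2, so d is 0.5, which would fall below the 0.8 threshold for a strong effect cited above.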
Within the MS group, technician and tablet testing for all 4 disease dimensions correlated significantly with EDSS score and disease duration. For EDSS, the strongest correlations were with the walking tests (technician WST r = 0.67; tablet WST r = 0.67). Correlations between the other tests and EDSS ranged from -0.37 (technician SLCLA, 2.5%) to 0.53 (tablet MDT). Correlations with disease duration ranged from r = -0.34 (technician SLCLA, 2.5%) to r = -0.46 (technician SDMT).
Table 5 shows test scores for more severe compared with more mild MS (Table 5a – progressive forms of MS compared with relapsing forms of MS; Table 5b – longer disease duration compared with shorter disease duration; Table 5c – EDSS ≥4.0 compared with EDSS <4.0). For each definition of disease severity, scores were significantly worse in the more severe MS group. Not surprisingly, the walking tests strongly separated progressive from relapsing patients, and high from low EDSS patients, since the definition of these categories is heavily dependent on walking ability. Cognitive processing speed testing correlated better with disease duration than with disease category, as expected. In most cases, the effects were quite strong, and MSPT testing performed as well as or better than technician testing.
Table 6 shows correlations with patient reported outcomes from the MSPS. There were significant correlations between walking test scores, and patient self-reports on mobility; and between manual dexterity test scores and patient self-reports on hand function. There were no significant correlations between the vision or processing speed testing and patient reports of visual or cognitive problems. There were significant correlations between bladder and spasticity self-reports and test scores from all 4 dimensions and significant correlations between fatigue and test scores from 3 of the 4 dimensions.
Figure 6 shows research subject satisfaction with the iPad MSPT testing. Each subject was asked to rate their level of agreement with a series of statements, and the proportion of responses in each category was tabulated for each statement. Research subjects responded to the following statements: 1) The instructions for the application were easy to understand (Figure 6A); 2) I am a frequent tablet or smart phone user (Figure 6B); 3) The applications were easy to see on the screen (Figure 6C); 4) Completing tasks on the tablet using the touch screen was easy (Figure 6D); 5) I had difficulty wearing the tablet during the walking and balancing tests (Figure 6E); and 6) Completing these applications caused me to become fatigued (Figure 6F). Research subject acceptance of the iPad testing was generally favorable and comparable between MS and HC for 4 of the 6 statements. MS patients were less likely than HCs to state that understanding the instructions was easy, and more likely to state that using the tablet was fatiguing.
Figure 1. Research subject with tablet positioned at sacral level for walking and balance testing.
Figure 2. Manual dexterity test apparatus fitted to tablet.
Figure 3. Research subject testing low contrast letter acuity.
Figure 4. Processing Speed Test Screen Layout.
Figure 5. Test-Retest Data for technician and tablet testing. Each panel shows test-retest data for each subject for the technician testing (labeled “1”) and the tablet testing (labeled “2”). HCs are closed black circles, and MS patients are closed red circles. Panel A shows test-retest data for the 25FW/WST; Panel B shows test-retest data for the 9HPT/MDT; Panel C shows test-retest data for the SLCLA/LCLAT; and Panel D shows test-retest data for the SDMT/PST. Reproducibility was high for both technician and tablet based tests. Concordance correlation coefficients for all subjects/HCs/MS: Technician WST: 0.982/0.917/0.981; tablet WST: 0.961/0.736/0.959; Technician 9HPT: 0.921/0.777/0.93; tablet MDT dish test: 0.911/0.749/0.910; Technician SLCLA: 0.905/0.883/0.905; tablet LCLAT: 0.925/0.874/0.944; Technician SDMT: 0.853/0.791/0.889; tablet PST: 0.867/0.865/0.831.
Figure 6. Satisfaction Data with tablet MSPT. The figure shows level of agreement with the questions above each panel. For each of the 6 questions, HC responses are shown on the left, and MS responses on the right. The large majority of research subjects agreed that the instructions were easy to understand, the tablet applications easy to see, that completing the testing was easy, and disagreed that wearing the tablet for gait testing was difficult or that the testing was fatiguing. Response distribution was similar for HC and MS subjects for 4 of the 6 questions. MS patients were less likely to state that understanding the instructions was easy, and were more likely to find the testing fatiguing.
Table 1. Dimensions of Interest and MSPT / MSFC Tests.
Dimension | MSPT Test | MSFC Test | Comment |
Lower extremity function | WST | 25FW | WST has been shown to correlate with EDSS, and patient self-reports |
Walking and standing stability | Balance Test | None | Imbalance is a common MS manifestation, but there are no practical balance tests for general use |
Hand coordination | MDT | 9HPT | 9HPT has been shown to be informative in clinical trials |
Cognitive processing speed | PST | SDMT | PASAT-3 was recommended for the initial version of the MSFC, but an expert panel has recommended that it be replaced by SDMT |
Vision | LCLAT | SLCLA | SLCLA has been validated in MS patients and recommended for future versions of the MSFC |
Table 2. Patient and Healthy Control Characteristics.
Table 3. Pearson Correlation Coefficients Between iPad Tests and Analogous Technician Tests.
Table 4. Ability of Each Test to Distinguish Between MS and HC.
Table 5. Test outcomes for MS of varying disease progression states.
Table 5a. CIS+RR vs. SP.
Table 5b. DD <Median vs. DD ≥Median.
Table 5c. EDSS <Median (EDSS 4.0) vs. EDSS ≥Median (EDSS 4.0).
Table 6. Pearson Correlation with MSPS (Patients Reports) – Morning session data.
Note: Gray rows contain technician data and white rows contain iPad data. Correlations with p-value <0.01 are shaded light green, and correlations with p-value <0.05 are shaded light yellow. Cells showing expected correlations (25FW/WST vs mobility; 9HPT/MDT Dish vs hand coordination; SLCLA/LCLAT 2.5% vs vision; and SDMT/PST vs cognition) are highlighted with double cell borders.
Multiple sclerosis outcome assessment methods range from biological measures of the disease process (e.g., inflammatory markers in blood or CSF) to patient reported outcomes (PROs) reflecting symptoms and feelings related to the disease. In between these extremes are imaging measures, many based on magnetic resonance imaging (MRI), clinician rated outcomes (Clin-ROs), and performance based outcomes (Perf-Os). Clin-ROs and Perf-Os are extremely important for many reasons. They are used to rate the severity of clinical manifestations, track disease evolution over time, or assess response to therapy. In the regulatory environment, Clin-ROs and Perf-Os are used as the primary outcome measure for phase 3 registration trials. Importantly, Clin-ROs and Perf-Os are also used to categorize or measure disease severity for studies focused on pathogenesis. For these reasons, reproducible and validated clinical outcome measures are crucial to advance patient care and research.
Two fundamentally different approaches to MS clinical outcome measures are clinician rating scales and quantitative tests of neurological and neuropsychological performance. The most commonly used and generally accepted rating scale in the MS field is the Kurtzke EDSS. The most common quantitative performance measure is the MSFC. The advantages of each of these approaches, and their shortcomings, have been reviewed and debated. MSFC-type measures carry advantages in terms of precision and the quantitative nature of the data, but interpreting the meaning of small changes to the patient may be difficult. Nevertheless, efforts are underway to improve on the MSFC approach and to derive a more informative disability measure that could be qualified as a primary outcome measure for future trials in progressive MS populations.
MSFC testing has been included in most MS drug trials over the past 15 years, and components of the MSFC (particularly WST and 9HPT) are commonly used in clinical practice. This is empirical evidence of value in neuroperformance testing in both the clinical trial and practice settings. Given the tendency within the MS field to use MSFC testing, current efforts to further develop this approach15, and advances in information technology, we developed the MSPT.
In this report, we document high precision, strong correlations between MSPT component test results and the analogous technician-based testing, and favorable sensitivity in distinguishing MS from controls, and mild from severe MS. Also, we document significant correlation between patient reports and MSPT testing of walking and hand function. In all comparisons, MSPT testing compared favorably to technician-based testing. Finally, we document high test subject acceptance of the iPad based testing.
This work has several implications. First, conducting neurological performance testing within the computer environment enables various direct manipulations and analyses of primary and derivative data. An example is the Symbol Digit Modalities Test. For this technician administered test, the number of correct answers in 90 sec is recorded manually on a case report form, and transferred to a research database. For each research subject, one number is returned for each test session. Using a computer-based analogue of the SDMT as a measure of cognitive processing speed, the analysis program can easily and instantaneously determine the number correct for each 30 sec interval and can generate within-test-session slopes based on each 30 sec interval. These slopes might represent learning ability (e.g., improving slope) or cognitive fatigue (e.g., worsening slope). These exploratory parameters may correlate with dimensions of the disease not captured by the limited information available from technician testing. Thus the context of testing results in greater information content, even though the testing itself may require similar effort on the part of the research subject.
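The per-interval analysis described above can be sketched as follows. This is an illustrative Python example, not the MSPT software: it assumes the application logs a timestamp (seconds from test start) for each correct response, and it uses an ordinary least-squares slope across the 30 sec intervals, which is one plausible way to define the within-session slope. All names and values are hypothetical.

```python
def interval_counts(correct_times_sec, test_len_sec=90, interval_sec=30):
    """Count correct responses in each consecutive interval of the test."""
    n_bins = test_len_sec // interval_sec
    counts = [0] * n_bins
    for t in correct_times_sec:
        if 0 <= t < test_len_sec:  # ignore anything logged outside the test
            counts[int(t // interval_sec)] += 1
    return counts


def least_squares_slope(values):
    """Ordinary least-squares slope of values against interval index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 intervals to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (v - mean_y) for x, v in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical timestamps (sec) of correct responses in one 90 sec session.
correct_times = [1, 5, 12, 20, 28, 33, 41, 55, 62, 80]
counts = interval_counts(correct_times)
slope = least_squares_slope(counts)
print(counts, slope)
```

Here the counts per interval are 5, 3, and 2, and the fitted slope is -1.5 correct responses per interval. Under the interpretation suggested in the text, a negative slope of this kind might reflect cognitive fatigue, whereas a positive slope might reflect learning; the technician-administered test would record only the total of 10.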
Secondly, results can be directly transmitted to research or clinical data repositories without paper or electronic case report forms. This would substantially reduce the need for manual data quality checks, the cost of transcribing data manually, and would reduce human error. In aggregate, these advantages should translate into improved efficiency and data quality.
Third, computer based testing could be widely disseminated to patients who do not reside near a clinical trial performance site. Patients could be tested using the MSPT in a rural doctor’s office, potentially supporting participation in clinical trials for patients who might otherwise not be able to participate in a clinical trial simply because of distance.
Fourth, the MSPT could be used in practice settings (e.g., MS clinics) to collect standardized neuroperformance information. Because the data are standardized and quantitative, MSPT could provide a highly cost-efficient mechanism to collect MS assessment data during routine clinical practice. This could populate research registries and inform practice-based research related to natural history, treatment, effects of co-morbidities, and various other important topics.
Finally, the computer-based MSPT described in this paper could be adapted to in-home testing. This could be transformative, since data could be collected in the same location as the research participant’s (or clinical patient’s) normal environment, as opposed to the highly artificial circumstances in most clinical trial or patient care settings. In addition to substantially lowering barriers imposed by travel to a clinical trial site or academic center, this feature would provide data on neurological function in a real world setting. Further, multiple measurements over defined time periods could be collected, allowing a more precise assessment of overall neurological performance and identification of relevant fluctuations (e.g., fatigability over the course of the day, or significant deviation from individual average performance). This, in turn, would make functional testing much more patient-relevant, and more informative to clinicians and researchers.
It is important to note that the iPad is used as a platform to host the data collection and processing algorithms in the software. Much like other computerized testing approaches, the software was written in such a manner that should Apple or other tablet makers update the hardware or operating system, adjustments can be made in the acquisition and processing of the data to ensure that outcomes remain consistent across test modules and do not have to be re-validated under the future device or software configurations.
As neuroperformance testing is increasingly applied in MS and other chronic neurological and neuropsychological disorders, computer-adapted testing will have the same transformative effect on clinical care and research as standardized computer-adapted testing has had in the education field, with clear potential to accelerate progress in clinical care and research for neurological disorders.
The authors have nothing to disclose.
The authors gratefully acknowledge research funding for validation of the MSPT from Novartis Pharmaceutical Corporation, East Hanover, New Jersey.
Name of Material/ Equipment | Company | Catalog Number | Comments/Description |
9-Hole Peg Test Kit | Rolyan | A8515 | |
Apple iPad with Retina Display (16GB, Wi-Fi, White) | Apple | MD513LL/A | |
CD Player | Non-brand specific | ||
iPad Body Belt | Motion Med LLC | RMBB001 | Special order for The Cleveland Clinic |
LCVA Wall Chart | Precision Vision | 2180 | |
Music Stand | Non-brand specific | ||
PASAT Audio CD | PASAT.US | English | |
SDMT Test Materials | WPS | W-129 | |
Upper Extremity Overlay Apparatus | Motion Med LLC | PB002 | Special order for The Cleveland Clinic |