We describe how to implement a battery of behavioral tasks to examine the processing and integration of sensory stimuli in children with ASD. The goal is to characterize individual differences in temporal processing of simple auditory and visual stimuli and relate these to higher order perceptual skills like speech perception.
In addition to impairments in social communication and the presence of restricted interests and repetitive behaviors, deficits in sensory processing are now recognized as a core symptom in autism spectrum disorder (ASD). Our ability to perceive and interact with the external world is rooted in sensory processing. For example, listening to a conversation entails processing the auditory cues coming from the speaker (speech content, prosody, syntax) as well as the associated visual information (facial expressions, gestures). Collectively, the “integration” of these multisensory (i.e., combined audiovisual) pieces of information results in better comprehension. Such multisensory integration has been shown to be strongly dependent upon the temporal relationship of the paired stimuli. Thus, stimuli that occur in close temporal proximity are highly likely to result in behavioral and perceptual benefits – gains believed to be reflective of the perceptual system's judgment of the likelihood that these two stimuli came from the same source. Changes in this temporal integration are expected to strongly alter perceptual processes, and are likely to diminish the ability to accurately perceive and interact with our world. Here, a battery of tasks designed to characterize various aspects of sensory and multisensory temporal processing in children with ASD is described. In addition to its utility in autism, this battery has great potential for characterizing changes in sensory function in other clinical populations, as well as being used to examine changes in these processes across the lifespan.
Traditional neuroscience research has often approached understanding sensory perception by focusing on the individual sensory modalities. However, the environment consists of a wide array of sensory inputs that are integrated into a unified perceptual view of the world in a seemingly effortless manner. The fact that we exist in such a rich multisensory environment requires that we better understand the way in which the brain combines information across the different sensory systems. The need for this understanding is further amplified by the fact that the presence of multiple pieces of sensory information often results in substantial improvements in behavior and perception1-3. For example, there is a large improvement (up to 15 dB in the signal-to-noise ratio) in the ability to understand speech in a noisy environment if the observer can also see the speaker’s lip movements4-7.
One of the major factors that affects how the different sensory inputs are combined and integrated is their relative temporal proximity. If two sensory cues occur close together in time, a temporal structure that suggests common origin, they are highly likely to be integrated as evidenced by changes in behavior and perception8-12. One of the most powerful experimental tools for examining the impact of multisensory temporal structure on behavioral and perceptual responses is the simultaneity judgment (SJ) task13-16. In such a task, multisensory (e.g., visual and auditory) stimuli are paired at various stimulus onset asynchronies (SOAs) ranging from objectively simultaneous (i.e., a temporal offset of 0 msec) to highly asynchronous (e.g., 400 msec). Participants are asked to judge the stimuli as simultaneous or not via a simple button press. Even when the visual and auditory stimuli are presented at SOAs of 100 msec or more, subjects report that the pair was simultaneous on a large proportion of trials. The window of time in which two inputs can occur and have a high probability of being perceived as occurring simultaneously is known as the temporal binding window (TBW)17-19.
The TBW is a highly ethological construct, in that it represents the statistical regularities of the world around us19. The “window” provides flexibility for the specification of events of common origin; one that allows for stimuli occurring at different distances with different propagation times (both physical and neural) to still be “bound” to one another. However, although the TBW is a probabilistic construct, changes that expand (or contract) the size of this window are likely to have cascading and potentially detrimental effects on perception20,21.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that has been classically diagnosed on the basis of deficits in social communication and the presence of restricted interests and repetitive behaviors22. In addition, and as recently codified in the DSM-5, children with ASD frequently exhibit alterations in their responses to sensory stimuli. Rather than being restricted to a single sense, these deficits often encompass multiple senses including hearing, touch, balance, taste and vision. Along with such a “multisensory” presentation, individuals with ASD often exhibit deficits in the temporal realm. Collectively, these observations suggest that multisensory temporal function may be preferentially altered in autism17,23-25. Although concordant with the view of altered sensory function in ASD, changes in multisensory temporal function may also be an important contributor to the deficits in social communication in ASD, given the importance of rapid and accurate binding of multisensory stimuli for social and communication functions. Take as an example the speech exchange described above in which important information is contained in both the auditory and visual modalities. Indeed, these tasks have been used to demonstrate significant differences in the width of the multisensory TBW in high functioning children with autism26-28.
Due to its importance for normal perceptual function, its potential implications for higher order processes such as social communication (and other cognitive abilities), and its clinical relevance, a battery of tasks designed to assess multisensory temporal function in children with ASD is described.
Ethics statement: All subjects must provide informed consent prior to the experiment. The research described here has been approved by the Vanderbilt University Medical Center’s Institutional Review Board.
1. Experiment Set Up
2. Stimuli
3. Task Battery
NOTE: This task requires that all participants are able to understand and comply with verbal instructions from the experimenter.
4. Simultaneity Judgment (SJ)
NOTE: The SJ task is a two-alternative forced-choice (2-AFC) task in which a visual ring and a 1,000 Hz auditory tone are presented at various SOAs (negative = auditory preceding visual, positive = visual preceding auditory) in random order.
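The random ordering of SOAs described in this note can be sketched as follows. This is a minimal illustration, not the authors' presentation code: the function name `build_sj_trials`, the example SOA values, and the number of repetitions are all assumptions, since the source does not list them here.

```python
import random

def build_sj_trials(soas_ms, repeats, seed=None):
    """Return a shuffled list of SOAs (ms), each SOA appearing `repeats` times.

    Sign convention from the note above: negative = auditory first,
    positive = visual first.
    """
    rng = random.Random(seed)  # seeded generator so a session can be reproduced
    trials = [soa for soa in soas_ms for _ in range(repeats)]
    rng.shuffle(trials)
    return trials

# Hypothetical SOA set and repetition count, chosen only for illustration.
EXAMPLE_SOAS_MS = [-400, -300, -200, -100, 0, 100, 200, 300, 400]
trials = build_sj_trials(EXAMPLE_SOAS_MS, repeats=10, seed=1)
```

Building the full list before shuffling guarantees that every SOA is presented equally often, which keeps the per-SOA proportions comparable.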
5. Temporal Order Judgment (TOJ)
NOTE: The auditory TOJ task is a 2-AFC task used to examine the temporal acuity of auditory processing. The visual TOJ task is a 2-AFC task used to examine the temporal acuity of visual processing. The multisensory TOJ task is a 2-AFC task used to examine temporal acuity across audition and vision. Each task takes approximately 10 – 15 min to complete.
6. McGurk Task
NOTE: The McGurk illusion consists of a video of the visual syllable “ga” paired with an auditory recording of the syllable “ba”. Many subjects will actually fuse the visual and auditory syllables and perceive this pair as the syllable “da” or “tha”32.
This task battery has proven very successful in measuring individual differences in temporal processing in individuals with and without ASD17,18,23,27. For the SJ task, plot the resulting data from each individual subject by first calculating the proportion of responses at each SOA that subject responded “synchronous” and then fitting the resulting response curve with a Gaussian curve. As illustrated in Figure 1A, there is a window of time in which visual-auditory stimuli pairs can be presented with a delay and will be perceived as synchronous on a high proportion of trials. The width of the “left” (covering auditory-first asynchronies) and “right” (covering visual-first asynchronies) side of the TBW is measured by calculating the width of the window from 0 ms to the SOA on each side that corresponds to 50% synchronous responses (dashed lines, Figure 1A). A robust finding across both TD participants and clinical populations is that the right TBW (visual first) is typically wider than the left (auditory first) TBW. Participants with ASD also show a wider TBW than their TD counterparts (Figure 1B).
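The first two SJ analysis steps (proportion of "synchronous" responses per SOA, then a Gaussian fit) can be sketched as below. This is an illustration under stated assumptions: the trial format, the function names, and the exhaustive grid search are hypothetical. The grid search is a stand-in chosen to keep the example dependency-free; it is not the fitting routine used by the authors, and a dedicated least-squares optimizer would normally be preferred.

```python
import math
from collections import defaultdict

def proportion_synchronous(trials):
    """trials: iterable of (soa_ms, responded_synchronous) pairs.

    Returns {soa_ms: proportion of 'synchronous' responses}, sorted by SOA.
    """
    counts = defaultdict(lambda: [0, 0])  # soa -> [n_synchronous, n_total]
    for soa, synchronous in trials:
        counts[soa][0] += 1 if synchronous else 0
        counts[soa][1] += 1
    return {soa: n_sync / n_total
            for soa, (n_sync, n_total) in sorted(counts.items())}

def gaussian(soa, amplitude, mean, sd):
    """Gaussian response curve: peak height `amplitude`, centered on `mean`."""
    return amplitude * math.exp(-((soa - mean) ** 2) / (2 * sd ** 2))

def fit_gaussian_grid(proportions, amplitudes, means, sds):
    """Least-squares fit by exhaustive search over candidate parameter values.

    Returns the (amplitude, mean, sd) combination with the smallest summed
    squared error against the observed proportions.
    """
    best = None
    for a in amplitudes:
        for m in means:
            for s in sds:
                sse = sum((p - gaussian(x, a, m, s)) ** 2
                          for x, p in proportions.items())
                if best is None or sse < best[0]:
                    best = (sse, a, m, s)
    return best[1:]

# Synthetic check: noiseless data generated from known parameters.
true_params = (0.9, 50, 150)
soas = [-400, -300, -200, -100, 0, 100, 200, 300, 400]
synthetic = {soa: gaussian(soa, *true_params) for soa in soas}
fitted = fit_gaussian_grid(
    synthetic,
    amplitudes=[0.7, 0.8, 0.9, 1.0],
    means=range(-100, 101, 25),
    sds=range(50, 301, 25),
)
```

Whatever fitting method is used, each subject's fitted curve should still be inspected against the raw proportions before parameters are extracted.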
For the TOJ tasks, the data from each individual subject are first plotted by calculating the proportion of responses at each SOA on which the “positive” stimulus was perceived as being presented first (higher tone, bottom circle, visual flash), and the resulting response curves from each task are fit with a cumulative Gaussian curve. Example TOJ curves from a single TD subject are shown in Figure 2. While performance on the unisensory TOJ tasks is highly accurate for all but the smallest SOAs (2A and 2B), determining temporal order across modalities is much more difficult, as indexed by a much shallower curve (2C) and decreased accuracy (2D) for the multisensory TOJ task. The point of subjective simultaneity (PSS) for each subject is measured by calculating the SOA at which the subject performs at chance, i.e., where the fitted curve crosses 50% (see dashed line, Figure 2A-C). Perform a t-test to determine if there are differences between groups. To compare performance across tasks or across subjects, calculate accuracy at each SOA and plot as a function of the delay between the stimulus pair (collapsing across the positive and negative SOA at each delay; see Figure 2D). Some studies examining sensory processing in ASD have found differences in TOJ tasks between ASD and TD groups23,34, while others have not observed significant differences between groups27. The reason for these discrepancies is unclear, although high heterogeneity across individuals with ASD35 and slight differences in task structure across these studies may play a role.
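The two TOJ summaries described above (the cumulative Gaussian response curve and accuracy collapsed across positive and negative SOAs) can be sketched as follows. The trial format and function names are assumptions for illustration. The sketch also assumes that a positive SOA means the "positive" stimulus was physically first, and it skips any 0 msec trials because they have no correct answer.

```python
import math
from collections import defaultdict

def cumulative_gaussian(soa, pss, sd):
    """Probability of reporting 'positive stimulus first' at a given SOA.

    The curve crosses 0.5 at soa == pss, so the fitted mean is the PSS;
    a larger sd gives a shallower curve (poorer temporal acuity).
    """
    return 0.5 * (1.0 + math.erf((soa - pss) / (sd * math.sqrt(2.0))))

def accuracy_by_delay(trials):
    """trials: iterable of (soa_ms, reported_positive_first) pairs.

    Returns {absolute delay in ms: proportion correct}, collapsing the
    negative and positive SOA at each delay.
    """
    counts = defaultdict(lambda: [0, 0])  # delay -> [n_correct, n_total]
    for soa, reported_positive_first in trials:
        if soa == 0:
            continue  # no objectively correct order at 0 ms
        correct = reported_positive_first == (soa > 0)
        counts[abs(soa)][0] += int(correct)
        counts[abs(soa)][1] += 1
    return {delay: n_correct / n_total
            for delay, (n_correct, n_total) in sorted(counts.items())}

# Small hand-checkable example.
example_trials = [(-50, False), (50, True), (50, False), (-200, False), (200, True)]
accuracy = accuracy_by_delay(example_trials)
```

In this example two of the three 50 msec trials and both 200 msec trials are correct, so accuracy rises with delay as in Figure 2D.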
McGurk perception is analyzed by calculating the proportion of trials on which the participant perceived the fused percept “da” relative to the total number of trials presented. Example results from the McGurk task are shown for an ASD and TD subject group in Figure 3A. Even within the same individual, responses to the stimulus can often vary from trial to trial; therefore, it is useful to consider the distribution of these responses. There is currently some debate in the literature about differences in multisensory integration as indexed by McGurk perception. Some groups have found that TD subjects have increased McGurk perception compared with ASD subjects27,36, while others have found that ASD subjects had higher McGurk perception37. Some of these discrepancies may be explained by differences in the McGurk stimulus used in each study. Some McGurk stimuli are “stronger” than others (i.e., they are more likely to elicit the illusory McGurk percept on a high proportion of trials for a subject), which can be quantified by a recent model of variability in McGurk perception38. As an example of the utility of this battery, individual differences in temporal processing (such as the width of the TBW) can be correlated with performance differences on a perceptual task like the McGurk illusion (Figure 3B). Several studies have observed a link between temporal acuity in the SJ task and perceptual differences in speech perception in the McGurk task and other measures of multisensory integration18,27.
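The McGurk score and its correlation with TBW width can be sketched as below. The response labels, function names, and example numbers are hypothetical and are not data from the study. By default only "da" is counted as the fused percept, as in the analysis described above; "tha" can be added through the `fused` argument if it is also scored as fusion.

```python
import math

def proportion_fused(responses, fused=("da",)):
    """Proportion of McGurk trials on which a fused percept was reported."""
    return sum(r in fused for r in responses) / len(responses)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for three hypothetical subjects, for illustration only.
tbw_width_ms = [200.0, 300.0, 400.0]
fused_rate = [
    proportion_fused(["da", "da", "da", "ba"]),
    proportion_fused(["da", "da", "ba", "ba"]),
    proportion_fused(["da", "ba", "ba", "ba"]),
]
r = pearson_r(tbw_width_ms, fused_rate)
```

A negative `r` here would correspond to the pattern described for Figure 3B, in which a wider TBW goes with less frequent fusion.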
Figure 1. Simultaneity Judgment (SJ) results. Representative data from the Simultaneity Judgment (SJ) task for a single ASD subject (age = 8) and a single TD subject (age = 9). (A) Raw data from the SJ task for a single ASD subject are shown in black. The fitted Gaussian curve is shown in blue. The blue dashed lines show the width of the left and right TBW (227 msec and 333 msec, respectively) for this individual subject. (B) Fitted TBW curves for the same ASD subject (blue) and a single TD subject (red). The TD subject has a smaller TBW (left TBW = 166 msec, right TBW = 196 msec) than the ASD subject.
Figure 2. Temporal Order Judgment (TOJ) results. Representative data from the Temporal Order Judgment (TOJ) tasks from a single TD subject (age = 15). (A) Raw data and fitted curve for auditory TOJ task. Data are plotted as a function of lower pitch first responses across the different SOAs (negative SOAs indicate higher pitch came first, positive SOAs indicate lower pitch came first). (B) Raw data and fitted curve for visual TOJ task. Data are plotted as a function of bottom circle first responses across the SOAs (negative SOAs indicate top circle came first, positive SOAs indicate bottom circle came first). (C) Raw data and fitted curve for multisensory TOJ. Data are plotted as a function of visual flash first responses across the SOAs (negative indicates auditory beep came first, positive SOAs indicate visual flash came first). (D) Same data from A-C plotted as the average accuracy (correct identification of temporal order) at each delay (collapsed across the negative and positive SOA).
Figure 3. McGurk Task results and comparison of McGurk performance with Simultaneity Judgment performance. Representative data from the McGurk task with ASD and TD subject groups, adapted with permission from27. (A) Responses to the McGurk stimulus for TD (shown in black) and ASD (shown in red) subjects. Because of the variability of responses for the same stimulus both within individual subjects and across subjects in a group, responses are shown as the percent of trials that were perceived as each phoneme. ASD subjects heard the auditory syllable “ba” on a larger percentage of trials than TD subjects, while the TD subjects heard the fused audiovisual syllable “da” on a larger percentage of trials than ASD subjects. (B) Correlation between the width of the temporal binding window (TBW) from the SJ task and the proportion of trials in which the fused audiovisual syllable “da” was perceived from the McGurk stimulus in the same group of ASD subjects. There was a significant negative correlation, with lower McGurk perception associated with a larger TBW (r = 0.46, p < 0.05).
The manuscript describes elements of a psychophysical task battery that are used to assess temporal processing and acuity in sensory and multisensory systems research. The battery has wide applicability for a number of populations and has been used by our laboratory to characterize audiovisual temporal performance in typical adults18, children10,39, and in children and adults with autism17,23. In addition, it has been used to examine how various facets of the battery relate to one another in correlational analyses27, and is currently being used to relate sensory and multisensory performance measures to cognitive domains including language and communication, attention and executive function. It is important to note that the main limitation of this task battery with regard to testing individuals with ASD is that the format of the tasks requires that participants have the receptive language skills to understand verbal instructions and indicate this understanding. As such, the task battery is currently only suitable for testing high-functioning individuals with ASD.
The emphasis of the battery on temporal factors is grounded in the importance of these factors for the construction of veridical sensory and perceptual representations. In the multisensory realm, this is best captured in the construct of a multisensory “temporal binding window (TBW),” the epoch of time in which auditory and visual cues can strongly influence one another. As previously suggested, this window is a highly ecological construct, in that sensory events and their associated energies happen at different distances. Thus, accounting for the differences in propagation times of the auditory and visual signals, the brain assesses audiovisual temporal structure in relation to this window, and thus makes a probabilistic judgment as to whether the stimuli belong together or not. These data strongly argue for the TBW as a measure of temporal acuity and strength of multisensory integration, and indeed it has been shown that the width of this window appears to be correlated with the magnitude of the binding process, with those with smaller windows having larger indices of integration18,27.
In addition to being a probabilistic construct across individuals, the TBW is also very much dependent on stimulus and task. Indeed, as highlighted in the battery presented here, multisensory temporal function can be assessed using stimuli ranging from the very simple and non-ecological (e.g., flashes and beeps) to the most ethologically relevant of audiovisual signals (i.e., speech). In addition, the TBW can be derived from measures including simultaneity judgments, temporal order judgments, perception of illusory stimuli, etc. Hence, the collective use of tasks that differ in both their stimulus and task contingencies provides the most comprehensive window into audiovisual temporal function.
An individual’s TBW is measured by extracting parameters from a curve fit to the participant’s raw performance from the SJ task. Therefore, care should be taken to examine individual subjects’ curve fits to ensure that the fitted curve accurately describes the raw data. Although an array of definitions for measuring the width of the TBW exists in the literature, it is suggested that the following criteria be used to easily compare across subjects while still capturing individual differences in performance. First, the “left” and “right” TBW should be measured from 0 msec (the objective boundary between auditory-leading and visual-leading asynchronies) as opposed to the individual PSS (the mean of the fitted curve). Second, the width should be measured at 50% report of synchronous trials (not 50% of the maximum response for that subject), capturing the range of asynchronies in which a subject reported “same time” for a majority of trials. Because some subjects never report “same time” for more than 75% of the trials at any SOA, this will allow the greatest number of subjects to be included in the analysis.
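The two criteria above can be made concrete with a short sketch that computes the left and right TBW from fitted Gaussian parameters. It assumes the fitted curve has the form amplitude · exp(−(SOA − mean)² / (2·sd²)); the function name and parameter names are hypothetical. Solving that curve for the absolute 50% criterion gives crossings at mean ± sd·√(2·ln(amplitude / 0.5)), and each width is then measured from 0 msec rather than from the PSS.

```python
import math

def tbw_widths(amplitude, mean, sd, criterion=0.5):
    """Return (left_width, right_width) in ms for a fitted Gaussian SJ curve.

    Widths run from 0 ms SOA to the SOA on each side where the curve equals
    `criterion` (an absolute proportion of 'synchronous' reports, not a
    fraction of the subject's own maximum). Returns None when the curve
    never rises above the criterion, so no window can be defined.
    """
    if amplitude <= criterion:
        return None
    half_span = sd * math.sqrt(2.0 * math.log(amplitude / criterion))
    left_crossing = mean - half_span   # auditory-first side (negative SOAs)
    right_crossing = mean + half_span  # visual-first side (positive SOAs)
    # A side has zero width if the curve is already below criterion at 0 ms.
    left_width = max(0.0, -left_crossing)
    right_width = max(0.0, right_crossing)
    return left_width, right_width

# Illustrative parameters only: a curve shifted toward visual-first SOAs.
left, right = tbw_widths(amplitude=1.0, mean=50.0, sd=100.0)
```

With a positive mean, as in this example, the right (visual-first) window comes out wider than the left, and the difference between the two widths equals twice the mean.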
Along with its utility in characterizing multisensory temporal function in “neurotypical” populations across the lifespan, elements of the described task battery have been used to assess sensory and multisensory processes in individuals with ASD26-28,37. Although sensory disturbances have been classically associated with autism, it is only recently that these disturbances have entered the diagnostic vernacular, and that a stronger appreciation of how altered multisensory function may contribute to the autism phenotype has been gained. Indeed, the core impacted domains in autism (i.e., social communication) are representations that are built on the basis of multisensory processes, strongly suggesting that alterations in these processes could have detrimental effects on social communication. Using elements of the temporal battery described here, it has been established that multisensory temporal acuity is poorer in autism, and that this poorer performance is related to speech comprehension measures28. Ongoing work is seeking to relate various aspects of audiovisual temporal performance to a host of cognitive measures.
The authors have nothing to disclose.
This research was supported by NIH R21CA183492, the Simons Foundation, the Wallace Research Foundation, and by CTSA award UL1TR000445 from the National Center for Advancing Translational Sciences.
Oscilloscope
Photovoltaic cell
Microphone
Noise-cancelling headphones
Chin rest
Audiometer