Source: Laboratories of Jonas T. Kaplan and Sarah I. Gimbel—University of Southern California
Imagine the sound of a bell ringing. What is happening in the brain when we conjure up a sound like this in the “mind’s ear?” There is growing evidence that the brain uses the same mechanisms for imagination that it uses for perception.1 For example, when imagining visual images, the visual cortex becomes activated, and when imagining sounds, the auditory cortex is engaged. However, to what extent are these activations of sensory cortices specific to the content of our imaginations?
One technique that can help to answer this question is multivoxel pattern analysis (MVPA), in which functional brain images are analyzed using machine-learning techniques.2-3 In an MVPA experiment, we train a machine-learning algorithm to distinguish among the various patterns of activity evoked by different stimuli. For example, we might ask if imagining the sound of a bell produces different patterns of activity in auditory cortex compared with imagining the sound of a chainsaw, or the sound of a violin. If our classifier learns to tell apart the brain activity patterns produced by these three stimuli, then we can conclude that the auditory cortex is activated in a distinct way by each stimulus. One way to think of this kind of experiment is that instead of asking a question simply about the activity of a brain region, we ask a question about the information content of that region.
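To see what this looks like in practice, here is a minimal sketch of the core classification step using scikit-learn's linear support vector classifier on simulated voxel patterns (all shapes, labels, and values below are illustrative stand-ins, not the study's actual data):

```python
import numpy as np
from sklearn.svm import SVC

# Simulated stand-in data: 60 trials x 200 voxels, one row per
# pattern of activity evoked in the region of interest.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 200))
# One label per trial: which sound was imagined.
y = np.tile(["bell", "chainsaw", "violin"], 20)

# Train a linear classifier on the first 45 trials, then test on
# the held-out 15. With real data, above-chance accuracy would
# indicate that the region carries stimulus-specific information.
clf = SVC(kernel="linear")
clf.fit(X[:45], y[:45])
print("Held-out accuracy:", clf.score(X[45:], y[45:]))
```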
In this experiment, based on Meyer et al., 2010,4 we will cue participants to imagine several sounds by presenting them with silent videos that are likely to evoke auditory imagery. Since we are interested in measuring the subtle patterns evoked by imagination in auditory cortex, it is preferable if the stimuli are presented in complete silence, without interference from the loud noises made by the fMRI scanner. To achieve this, we will use a special kind of functional MRI sequence known as sparse temporal sampling. In this approach, a single fMRI volume is acquired 4-5 s after each stimulus, timed to capture the peak of the hemodynamic response.
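The 4-5 s delay targets the sluggish hemodynamic response, which peaks several seconds after a stimulus. A quick way to see this is to evaluate a canonical gamma-shaped model of the hemodynamic response function; the parameters below are a common textbook approximation, not values from this study:

```python
import numpy as np
from scipy.stats import gamma

# Canonical single-gamma approximation of the hemodynamic response
# (illustrative parameters: shape=6, scale=1 places the peak at 5 s).
t = np.linspace(0, 20, 201)        # seconds after stimulus onset
hrf = gamma.pdf(t, a=6, scale=1.0)

print(f"Modeled HRF peak: ~{t[np.argmax(hrf)]:.1f} s post-stimulus")
```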
1. Participant recruitment
2. Pre-scan procedures
3. Participant instructions
4. Scanner setup
5. Data collection
6. Data analysis
Figure 1: Region of interest tracing. The surface of the planum temporale has been traced on this participant's high-resolution anatomical image, and is shown here in blue. In green is the control mask of the frontal pole. These voxels will be used for MVPA.
Auditory imagery is a process that gives rise to the experience of hearing sounds, even when there are no external auditory stimuli present.
For instance, think about hearing the sound of a cell phone ringing. While information within memory underlies this imagined event, evidence suggests that the brain uses the same mechanisms for imagination as those involved in actual perception.
Merely imagining the ringing activates regions within the auditory cortex. However, although this holds across acoustic stimuli, how sounds are encoded to support detailed processing of distinct sounds, like distinguishing between a doorbell chime and a song playing on the radio, remains an important question.
Based on previous work by Meyer and colleagues, this video demonstrates how to combine functional magnetic resonance imaging—fMRI—with presentations of different silent videos to investigate how the brain responds to auditory imagery.
We will also describe how to use a method called multi-voxel pattern analysis—MVPA for short—to predict what subjects have imagined by analyzing patterns of activation obtained during fMRI sessions.
In this experiment, participants lie in an fMRI scanner and are shown a series of silent videos. Each one, whether it's a rooster crowing, a chainsaw cutting through a tree, or a person playing a piano, evokes distinctive and vivid auditory imagery, and participants are asked to imagine the corresponding sounds during every presentation.
The imaging acquisition procedure relies upon sparse temporal sampling, whereby a single fMRI volume is acquired 4 to 5 s after each stimulus is presented. Such timing captures the peak of the hemodynamic response and reduces the likelihood that signals would be masked by scanner noise.
Each imagined sound is expected to induce subtle yet distinctive patterns of neural activity, specifically in the auditory cortex. Here, pattern is the key word: the classical way to analyze these data uses a univariate approach, in which the activation levels of individual voxels are collapsed into a single average.
These averages are then compared across sounds, a comparison that can fail to reveal any significant differences in overall activation even when the region carries information.
A multivariate analysis, by contrast, keeps the voxels separate for each sound, so activation levels can be compared collectively, across all voxels, yielding a unique overall pattern for each imagined sound.
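The difference is easy to demonstrate numerically: two conditions can share an identical mean activation in a region while their voxel patterns remain perfectly distinguishable. A toy example with invented values:

```python
import numpy as np

# Four voxels in a region, responses to two imagined sounds.
bell     = np.array([1.0, 0.0, 1.0, 0.0])
chainsaw = np.array([0.0, 1.0, 0.0, 1.0])

# Univariate view: collapse each pattern into a single average.
print(bell.mean(), chainsaw.mean())    # 0.5 vs 0.5 -> no difference

# Multivariate view: the patterns themselves are easily separated.
print(np.array_equal(bell, chainsaw))  # False -> fully distinguishable
```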
With this multi-voxel pattern analysis, or MVPA, approach, if the patterns are indeed sensitive to specific content, then it’s possible that they could be used to predict the original stimulus. That’s right—MVPA is often referred to as a mind-reading technique!
To make such predictions, more intensive processing must be performed after the participants' data are collected and divided into training and testing sets.
Labeled data from the training set are first fed to a machine-learning algorithm, specifically a support vector machine, which learns to classify the data by recognizing features of the neural patterns that distinguish the three types of sounds from one another.
After the classifier has learned the features needed to identify each type accurately, it is presented with unlabeled data from the testing set, and its guesses are compared with the correct stimulus labels.
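Comparing guesses with true labels is exactly what accuracy_score and confusion_matrix in scikit-learn compute; here is a sketch with hypothetical predictions for six test trials:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

labels = ["rooster", "chainsaw", "piano"]
# Hypothetical true labels and classifier guesses for six test trials.
y_true = ["rooster", "chainsaw", "piano", "rooster", "chainsaw", "piano"]
y_pred = ["rooster", "chainsaw", "piano", "chainsaw", "chainsaw", "piano"]

print("Accuracy:", accuracy_score(y_true, y_pred))   # 5 of 6 correct
print(confusion_matrix(y_true, y_pred, labels=labels))
```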
In this case, classification performance, recorded as the classifier's accuracy, serves as the dependent variable, and it is also compared with performance based on voxels from a different brain region, such as the frontal pole.
The classifier is expected to predict the identity of the imagined sounds, demonstrating the value of MVPA for detecting content-specific activity within the auditory cortex.
For experimental and safety reasons, verify that all participants are right-handed, have normal or corrected-to-normal vision, have no history of neurological disorders or claustrophobia, and have no metal in their bodies. Also ensure that they fill out the necessary consent forms.
Before proceeding, explain to the participant that they will see several short, silent videos in the scanner that may evoke a sound in their mind. Ask them to focus on the imagined sounds, to “hear” them as best they can, and to remain still for the duration of the task.
Now, prepare the participant to enter the scanner. To see these steps in detail, please refer to another fMRI video produced in this collection.
Following preparation, align the participant and send them inside the bore. In the adjacent room, first collect a high-resolution anatomical scan. Then, synchronize the start of the silent video presentation with the start of the functional scan.
To achieve sparse temporal sampling, set the acquisition time of each MRI volume to 2 s, with a 9-s delay between acquisitions.
Importantly, time each 5-s video clip to begin 4 s after the previous MRI acquisition starts, so that the next volume captures the hemodynamic activity corresponding to the middle of the movie.
Present each of the three videos 10 times, in random order, generating one scanning session that lasts 5.5 min, as the arithmetic sketch below confirms. Repeat this functional acquisition sequence three more times.
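These numbers fit together as follows; a quick check of the protocol arithmetic:

```python
# Protocol arithmetic for one functional run.
acq_s, gap_s = 2, 9            # 2-s volume acquisition + 9-s silence
trial_s = acq_s + gap_s        # 11 s per trial
videos, reps = 3, 10           # three silent videos, 10 showings each

run_s = videos * reps * trial_s
print(run_s / 60, "min per run")          # 330 s -> 5.5 min

# Each 5-s video starts 4 s after the previous acquisition begins,
# so its midpoint (at 6.5 s) precedes the next volume (at 11 s)
# by 4.5 s, inside the 4-5 s window around the hemodynamic peak.
print(trial_s - (4 + 5 / 2), "s from video midpoint to next volume")
```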
After the four functional scans have been performed, bring the participant out of the scanner, and debrief them to conclude the study.
To define regions of interest, use the high-resolution anatomical scans of each participant and trace voxels on the surface of the temporal lobe that correspond to the early auditory cortex, also known as the planum temporale. In addition, create a mask containing voxels in the frontal lobe, which will be used as the control region.
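Once the masks exist as image files, extracting each volume's voxel values is straightforward with, for example, nilearn; the file names below are placeholders, not paths from this study:

```python
from nilearn.maskers import NiftiMasker

# Placeholder file names; substitute each participant's own images.
masker = NiftiMasker(mask_img="planum_temporale_mask.nii.gz")

# Produces a (volumes x voxels) matrix: one activity pattern per
# acquisition, restricted to the traced region of interest.
X = masker.fit_transform("functional_run1.nii.gz")
print(X.shape)
```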
Then, preprocess the data by performing motion correction to reduce movement artifacts and temporal filtering to remove signal drifts.
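As one possibility, the temporal-filtering step can be applied to the extracted voxel matrix with nilearn's signal.clean; motion correction itself is typically done earlier with scanner- or package-specific tools, and the cutoff below is an assumed value:

```python
from nilearn import signal

# X is the (volumes x voxels) matrix extracted above. Remove linear
# drifts and slow scanner-related fluctuations; t_r is the time
# between volumes (11 s under this sparse-sampling protocol).
X_clean = signal.clean(
    X,
    detrend=True,      # remove linear signal drift
    standardize=True,  # z-score each voxel's time course
    high_pass=0.008,   # assumed cutoff in Hz; tune per study
    t_r=11.0,
)
```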
Next, divide the data into two sets: training and testing. In one data set, train a classifier—a support vector machine algorithm—making sure to keep data from the two brain regions separate for each subject.
In the other set, assess what the classifier has learned—its ability to correctly guess the identity of unlabeled data—and record the algorithm’s accuracy across runs. Perform this procedure a total of four times, leaving out one functional scan as testing data each time—a process called cross-validation.
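Leave-one-run-out cross-validation maps directly onto scikit-learn's LeaveOneGroupOut, with the run index as the grouping variable; shapes and data here are simulated for illustration:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 200))        # 4 runs x 30 trials, 200 voxels
y = np.tile(["rooster", "chainsaw", "piano"], 40)   # stimulus labels
runs = np.repeat([1, 2, 3, 4], 30)         # run index for each trial

# Four folds: train on three runs, test on the held-out run.
scores = cross_val_score(SVC(kernel="linear"), X, y,
                         groups=runs, cv=LeaveOneGroupOut())
print("Fold accuracies:", scores, "mean:", scores.mean())
```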
To visualize the data, graph the averaged classifier accuracies across the four cross-validation folds for each participant.
Plot these averages for both the primary region of interest, the planum temporale, and the control area, the frontal pole, to assess the focal specificity of the classifier: the extent to which a particular area, such as the auditory cortex, is selectively involved in auditory imagery.
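A minimal matplotlib sketch of such a figure, with invented accuracies standing in for real results:

```python
import numpy as np
import matplotlib.pyplot as plt

participants = np.arange(1, 21)
rng = np.random.default_rng(1)
# Invented per-participant mean accuracies, for illustration only.
pt_acc = rng.uniform(0.45, 0.75, 20)   # planum temporale
fp_acc = rng.uniform(0.25, 0.40, 20)   # frontal pole control

plt.bar(participants - 0.2, pt_acc, width=0.4, label="Planum temporale")
plt.bar(participants + 0.2, fp_acc, width=0.4, label="Frontal pole")
plt.axhline(1 / 3, color="gray", linestyle="--", label="Chance (33%)")
plt.xlabel("Participant")
plt.ylabel("Mean classifier accuracy")
plt.legend()
plt.show()
```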
In this case, run a non-parametric test, the Wilcoxon signed-rank test, to compare performance against chance, which is 33% for three-way classification. Note that the average classifier accuracy in the auditory cortex was 59%, significantly above chance.
In contrast, the mean performance in the frontal pole mask was 32.5%, which is not significantly different from chance.
Moreover, notice that classifier performance varied across individuals. After using a permutation test to calculate a per-subject statistical threshold of 42%, note that 19 of 20 subjects had accuracy values above this level using voxels from the planum temporale, whereas none performed above chance using voxels from the frontal pole.
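Both tests are available in standard Python libraries; the sketch below uses simulated accuracies, and the permutation step is outlined only in comments because it depends on the full classification pipeline:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)
# Simulated group accuracies for 20 participants (placeholders).
acc = rng.uniform(0.45, 0.75, 20)

# One-sample Wilcoxon signed-rank test against chance (1/3).
stat, p = wilcoxon(acc - 1 / 3)
print(f"Wilcoxon W={stat:.1f}, p={p:.4f}")

# Per-subject permutation threshold, in outline: shuffle the labels,
# re-run the whole classification, and take the 95th percentile of
# the null accuracies ('classify' is a hypothetical helper):
#   null = [classify(X, rng.permutation(y)) for _ in range(1000)]
#   threshold = np.percentile(null, 95)
```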
Overall, these results show that MVPA techniques can accurately predict which of the three sounds participants were imagining based on patterns of neural activity. Such predictions were possible only from activity in the auditory cortex, suggesting that acoustic content is not represented globally throughout the brain.
Now that you are familiar with how to apply multi-voxel pattern analysis to study auditory imagery, let’s look at how neuropsychologists use multivariate techniques to advance a futuristic approach to mind-reading—the decoding of mental states—in other domains.
Classifiers have been used on fMRI data obtained from the ventral temporal cortex to predict the kinds of objects participants viewed, distinguishing between houses and faces, for example.
Taking this a step further, it’s even possible to predict whether the individual would buy that house or find the person pleasant. As creepy as that sounds, these neuromarketing implications are not far-fetched!
The same approach could also be used to detect emotional states after watching a show, recognizing that a scary film is indeed terrifying, or even to identify the movie genre; for instance, a frightening flick might engage the amygdala more predictably than a contemplative one, which would reliably involve the prefrontal cortex.
In addition, brain-computer interfaces could convert mental states into signals that enhance communication for individuals with impaired speech, or restore movement for those who have lost a limb.
You’ve just watched JoVE’s video on understanding auditory imagery using multi-voxel pattern analysis. You should now have a good understanding of how to design and conduct an auditory imagery experiment with functional neuroimaging, and how to analyze and interpret the resulting patterns of brain activity.
Thanks for watching!
The average classifier accuracy in the planum temporale across all 20 participants was 59%. According to the Wilcoxon signed-rank test, this is significantly different from the chance level of 33%. The mean performance in the frontal pole mask was 32.5%, which is not greater than chance (Figure 2).
Figure 2. Classification performance in each participant. For three-way classification, chance performance is 33%. According to a permutation test, the alpha level of p < 0.05 corresponds to 42%.
The permutation test found that only 5% of the permutations achieved accuracy greater than 42%; thus, our statistical threshold for individual subjects is 42%. Nineteen of the 20 subjects had classifier performance significantly greater than chance using voxels from the planum temporale, while none had performance greater than chance using voxels from the frontal pole.
Thus, we are able to successfully predict from patterns of activity in auditory cortex which of the three sounds the participant was imagining. We were not able to make this prediction based on activity patterns from the frontal pole, suggesting that the information is not global throughout the brain.
MVPA is a useful tool for understanding how the brain represents information. Instead of considering the time-course of each voxel separately as in a traditional activation analysis, this technique considers patterns across many voxels at once, offering increased sensitivity compared with univariate techniques. Often a multivariate analysis uncovers differences where a univariate technique is not able to. In this case, we learned something about the mechanisms of mental imagery by probing the information content in a specific area of the brain, the auditory cortex. The content-specific nature of these activation patterns would be difficult to test with univariate approaches.
There are additional benefits that come from the direction of inference in this kind of analysis. In MVPA we start with patterns of brain activity and attempt to infer something about the mental state of the participant. This kind of “brain-reading” approach can lead to the development of brain-computer interfaces, and may open new opportunities for communication with those who have impaired speech or movement.