To study the evolution of language, comparing brain mechanisms in humans with those in nonhuman primates is important. We developed a method to noninvasively measure the electroencephalography (EEG) of awake animals. It allows us to directly compare EEG data between humans and animals for the long term without harming them.
Vocal communication plays a crucial role in the social interactions of primates, particularly in survival and social organization. Humans have developed a unique and advanced vocal communication strategy in the form of language. To study the evolution of human language, it is necessary to investigate the neural mechanisms underlying vocal processing in humans, as well as to understand how brain mechanisms have evolved by comparing them with those in nonhuman primates. Herein, we developed a method to noninvasively measure the electroencephalography (EEG) of awake nonhuman primates. This recording method allows for long-term studies without harming the animals, and, importantly, allows us to directly compare nonhuman primate EEG data with human data, providing insights into the evolution of human language. In the current study, we used the scalp EEG recording method to investigate brain activity in response to species-specific vocalizations in marmosets. This study provides novel insights by using scalp EEG to capture widespread neural representations in marmosets during vocal perception, filling gaps in existing knowledge.
Primates use species-specific vocalizations to convey biologically important information, such as the caller's emotional state or intention to maintain social bonds, the presence of predators, or other dangerous situations. Investigation of the neural mechanisms underlying the perception of vocalization in vocal-rich nonhuman primates may provide us with critical clues to better understand the evolutionary origins of human language.
Common marmosets are small primates native to South America. In recent years, marmosets have been increasingly used as model animals, alongside macaque monkeys, because of their high reproductivity, ease of use owing to their small size, and the development of useful transgenic techniques1,2,3. In addition to their utility as disease models, rich vocal communication within groups is another unique characteristic of this species4,5,6,7. Marmosets routinely exchange vocal signals to communicate with invisible conspecifics in the forest. By examining the brain activity involved in vocal perception and production in marmosets, we can determine how they process the auditory information of their own or conspecific calls in the brain and identify which neural circuits are involved. Previous studies have demonstrated neural activity in the primary auditory cortex8,9,10,11,12 and frontal cortex13,14 involved in vocal production in marmosets. Furthermore, these excited and suppressed neuronal responses were modulated by auditory-vocal interactions in the primary auditory cortex8,10. These studies provided detailed neural activity data at the single-neuron level using invasive recording methods. Numerous studies have further examined the neural activity involved in marmoset vocal production; however, vocal perception remains poorly understood15,16.
Several noninvasive brain imaging studies have elucidated the neural mechanisms of vocal processing in marmosets17,18,19. Although their high spatial resolution is an advantage, keeping animals in the awake state during scanning requires advanced techniques. More recently, Jafari et al. identified frontotemporal regions involved in vocal perception in awake marmosets using functional magnetic resonance imaging (fMRI)19. Almost all experiments to elucidate the brain functions involved in vocal perception and production in humans have been conducted using noninvasive methods, such as scalp electroencephalography (EEG), magnetoencephalography (MEG)20,21, and fMRI22,23,24. Numerous studies in humans have investigated brain activity related to vocal perception using EEG. Most of these studies have focused on emotional information25,26,27 and the saliency of emotional words28, with the results revealing changes in event-related potentials during vocal perception29. Electrocorticography (ECoG) and single-neuron recordings using intracranially implanted electrodes in humans have only been conducted in a limited number of experiments in patients undergoing neurosurgical treatment30,31.
An evolutionary perspective comparing humans with monkeys is important when understanding the unique neural mechanisms underlying vocal perception and production that have developed in humans. To directly compare the neural mechanisms involved in speech perception and vocalization in vocal-rich nonhuman primates, such as the marmoset, with humans, it is important to compare data between the two species using the same method. Functional MRI allows whole-brain imaging and has a high spatial resolution. It has the advantage of recording activity perpendicular to the skull or in deep regions that are difficult to record with EEG or MEG. However, the MRI machine is expensive to install and maintain, and there are many restrictions on the stimuli that can be presented due to the nature of the device. In comparison, EEG, event-related potentials (ERPs), and MEG have a high temporal resolution, making them useful for analyzing time-series vocal processing. In particular, EEG has the advantages of high mobility and the ability to be used in a variety of experimental settings, relatively low cost, and the requirement for just a single operator.
Since a large amount of EEG data has already been obtained in humans, EEG measurement methods using noninvasive paradigms are needed for nonhuman primates. Our research group developed a unique noninvasive EEG recording method using tubes32 for macaques and marmosets. Using this method, we have reported several novel findings regarding auditory processing in nonhuman primates33,34,35,36,37. To characterize brain activity in response to species-specific vocalizations in marmosets, we constructed an experimental system to noninvasively record brain activity using electrodes placed on the scalp. In this study, we describe the EEG measurement method for marmosets.
All experiments were approved by the Animal Experimentation Committee of EHUB (No.2022-003, 2023-104) and conducted in accordance with the Guide for Care and Use of Laboratory Primates published by EHUB. Nine common marmosets (Callithrix jacchus, six males and three females, 2-12 years old, weighing 330-490 g) were used for the experiment.
1. Animals
2. Equipment (Figure 1B and Table of Materials)
3. Anesthesia
4. Hair removal
5. Mask preparation
6. Chair and mask adaptation (30 min/day for 3 days)
7. EEG recording (2 h/day)
8. Data analysis
NOTE: The original code written in the Programming software and toolbox was used to postprocess the EEG data, as outlined below (Supplemental File 2)37.
First, we plotted the average event-related potentials (ERPs) for each auditory stimulus in the marmosets (Figure 2). The auditory evoked potential (AEP) was prominent in the Noise condition, reflecting the clear onset of the stimuli (see Figure 1D). To compare the averaged ERPs between call types and noise stimuli, we applied a one-way analysis of variance (ANOVA) to the Cz response, with stimulus as the between-subjects factor. We found a significant main effect of stimulus on Cz activity at 13-18 ms, 28-36 ms, and 45-88 ms after stimulus onset (p < 0.05). Post hoc multiple comparison analysis with Tukey's method showed that the differences were between the noise and the calls (p < 0.05), with no difference between marmoset call types. This result suggests that differences in brain activity by call type could not be observed from the event-related potentials alone.
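The ERP averaging described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' postprocessing code (Supplemental File 2): the function names, the 200 ms pre-onset baseline, the 1,000 ms post-onset window, and the sampling rate supplied by the caller are assumptions made for the example. Only the sound-onset latencies are taken from the Figure 1D legend; they shift each trigger from the start of the audio file to the actual sound onset.

```python
from statistics import mean

# Sound-onset latency inside each audio file (ms), as listed in the Figure 1D legend.
ONSET_LATENCY_MS = {"Phee": 49, "Tsik-Ek": 35, "Tsik-string": 16, "Noise": 0}


def extract_epoch(signal, trigger_idx, stimulus, fs_hz, pre_ms=200, post_ms=1000):
    """Cut one epoch around the sound onset and subtract the pre-onset mean.

    signal      : list of voltages from one electrode (e.g., Cz)
    trigger_idx : sample index at which the audio file started
    fs_hz       : sampling rate in Hz (assumed; set to the actual recording rate)
    pre_ms, post_ms : hypothetical epoch limits, not taken from the protocol
    """
    # Shift the trigger by the stimulus-specific latency to reach the sound onset.
    onset = trigger_idx + round(ONSET_LATENCY_MS[stimulus] * fs_hz / 1000)
    n_pre = round(pre_ms * fs_hz / 1000)
    n_post = round(post_ms * fs_hz / 1000)
    if onset - n_pre < 0 or onset + n_post > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[onset - n_pre:onset + n_post]
    # Baseline correction: subtract the mean of the pre-onset samples.
    baseline = mean(epoch[:n_pre])
    return [v - baseline for v in epoch]


def average_erp(epochs):
    """Average equally long epochs sample by sample."""
    return [mean(samples) for samples in zip(*epochs)]
```

Calling `extract_epoch` once per trial and passing the resulting list to `average_erp` yields one averaged waveform per stimulus and electrode, which can then be compared across stimuli at each time point.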
Next, we conducted a time-frequency analysis for each subject. Figure 3 shows examples of the time-frequency maps for the Tsik-string call obtained from subject R (elder, Figure 3A) and subject Y (younger, Figure 3B). We found that event-related spectral power increased in a lower frequency range of approximately 20-50 Hz immediately after stimulus onset. These responses were prominent at Cz. In contrast, the gamma range power (over 30 Hz) decreased after stimulus onset compared to the baseline period, and this decrease lasted for 1 s. No decrease in event-related power was observed in elder individuals over 8 years of age. In these examples, the elder individuals had a stronger initial response to the call in the vertex region (Cz), while the younger individuals showed a sustained decrease in γ-band activity during the call presentation. The results suggest that there are differences in initial and sustained responses depending on the subject's age.
Finally, we investigated the relationship between subject age and event-related spectral perturbation (ERSP) power in the initial transient response (Figure 4A) and sustained response (Figure 4B). A two-way ANOVA with Stimulus type as a within-subject factor and Age as a between-subject factor was conducted to determine the contribution of the type of auditory stimulus and the subject's age to EEG activity. The initial transient responses at Fz showed significant main effects of Stimulus type (F(3,24) = 9.020, p < 0.001) and Age (F(8,24) = 3.934, p = 0.004). However, there was no significant interaction between Stimulus type and Age (p = 0.144). For the transient responses at Cz, there was a significant main effect of Stimulus type (F(3,24) = 8.533, p < 0.001), but no effect of Age (F(8,24) = 2.215, p = 0.073), and no interaction (p = 0.228). For sustained responses at Fz, there were significant main effects of both Stimulus type and Age (F(3,24) = 9.020, p < 0.001; F(8,24) = 3.934, p = 0.004, respectively). No significant interaction was observed (p = 0.144). The sustained responses at Cz showed a significant main effect of Stimulus type (F(3,24) = 8.533, p < 0.001) but no main effect of Age (F(8,24) = 2.215, p = 0.073) or interaction (p = 0.228). These results suggest that in the middle-frontal area (Fz), initial transient responses to call and noise stimuli varied greatly with increasing age, and sustained responses were suppressed in younger age groups. These findings may reflect the functional maturation of the frontal region.
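The two ERSP measures compared above can be computed from a time-frequency power map as a baseline-normalized band average. The sketch below is illustrative and is not the authors' code (Supplemental File 2): the function names, the data layout (`power[f][t]`), and the decibel scaling are assumptions. The frequency bands, time windows, and baseline period are those given in the Figure 4 legend.

```python
import math


def mean_ersp_db(power, times_ms, freqs_hz, band_hz, window_ms, baseline_ms=(-200, 0)):
    """Mean baseline-normalized power (dB) within a frequency band and time window.

    power[f][t] holds trial-averaged spectral power (linear units) at
    freqs_hz[f] and times_ms[t]. All bounds are inclusive.
    Decibel scaling is an assumption of this sketch.
    """
    f_idx = [i for i, f in enumerate(freqs_hz) if band_hz[0] <= f <= band_hz[1]]
    t_win = [j for j, t in enumerate(times_ms) if window_ms[0] <= t <= window_ms[1]]
    t_base = [j for j, t in enumerate(times_ms) if baseline_ms[0] <= t <= baseline_ms[1]]
    if not (f_idx and t_win and t_base):
        raise ValueError("band, window, or baseline selects no samples")
    values = []
    for i in f_idx:
        # Normalize each frequency row by its own mean baseline power.
        base = sum(power[i][j] for j in t_base) / len(t_base)
        values.extend(10 * math.log10(power[i][j] / base) for j in t_win)
    return sum(values) / len(values)


def transient_response(power, times_ms, freqs_hz):
    """Initial response: 8-29 Hz, 1-150 ms after stimulus onset (Figure 4A)."""
    return mean_ersp_db(power, times_ms, freqs_hz, band_hz=(8, 29), window_ms=(1, 150))


def sustained_response(power, times_ms, freqs_hz):
    """Sustained response: 30-100 Hz, 151-950 ms after stimulus onset (Figure 4B)."""
    return mean_ersp_db(power, times_ms, freqs_hz, band_hz=(30, 100), window_ms=(151, 950))
```

A negative value from `sustained_response` corresponds to a decrease in power relative to the pre-stimulus baseline, which matches the sign convention stated in the Figure 4 legend.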
Previous neurophysiological studies have reported neuronal responses in the primary auditory cortex during vocalization in marmosets9,38. In addition, more than half of these neurons exhibit an inhibitory response that persists during vocalization9,38. Furthermore, previous electrophysiological studies in nonhuman primates and humans have shown that high gamma band activity in the local field potential (LFP) and ECoG correlates well with firing rates in neurons39,40,41,42. Scalp EEG is a spatiotemporally smoothed version of the LFP, integrated over an area of 1 cm² or more43. Although the high-gamma component of the EEG has a lower correlation with firing rates than LFP and ECoG, it is thought to code the output signal as an integrated range of several centimeters. Therefore, the sustained decrease in gamma band power observed at Cz in our experiments may reflect the activity of neuronal clusters showing suppressed activity during call emission, which is found in the auditory cortices. In contrast, previous electrophysiological studies have reported that more neurons in the frontal cortex, mainly in the premotor cortex, exhibit excitatory responses during call vocalization13. Interestingly, in our experiments, sustained inhibitory activity was observed even at Fz, which was placed over the frontal area, although distinct neural mechanisms were observed between vocal perception and production. A recent fMRI study has further identified several subregions in the frontal cortex, including the anterior cingulate cortex as well as the premotor cortex, as 'vocal patches' that respond to species-specific calls in marmosets19. Our results reflect the overall brain activity in these regions.
In the current experiment, we used only midline positions for the exploration electrodes (Fz, Cz, Pz, and Oz); therefore, we cannot draw conclusions about differences between right and left hemispheric EEG activity during auditory processing. Future studies should investigate the laterality of the neural activity underlying vocal processing.
Prior studies have reported that low gamma activity in the LFP and ECoG is generated by synaptic inputs to pyramidal cells. Thus, high gamma activity reflects a signal component closer to the output, whereas low gamma activity reflects a component closer to the input39. In the present study, we observed transient activity in the beta and low-gamma bands immediately following exposure to a call. These responses may reflect sensory input signals to the cortex. The advantage of our method is that it can capture the brain activity from different neuronal populations as dynamic changes with high temporal resolution. To our knowledge, this is the first study to reveal how scalp EEG signals change during species-specific vocal perception in marmosets. The present results provide new insights into the integration of neural representations through recordings of a wide range of brain regions.
Figure 1: Experimental setup. (A) An exemplar image of a subject during recording. The marmoset is seated in a chair and the head is fixed to the chair by a mask. The mouth is kept open to facilitate breathing and drinking reward fluids, and the electrodes are attached to the top of the head. (B) Equipment: Sound stimuli are presented through a speaker. An amplifier, electrode input box, and a monitoring camera were also installed. (C) Location of electrodes: Electrodes were placed on Fz, Cz, Pz, Oz, A1, and A2 according to the International 10-20 System. We defined the location of electrodes using the inion, nasion, and bilateral preauricular points as anatomical landmarks. The C3 or C4 electrode was used as the ground electrode. (D) Sonograms (left panel) and spectrograms (right panel) for all audio stimuli. The Phee call is a single long call lasting less than 2 s. The Tsik-Ek call is a combination call of a Tsik followed by an Ek. Two sets of the compound call were presented over approximately 1 s. The Tsik-string call is a repetitive call of Tsik5, and four Tsik were presented over approximately 1 s in the stimulus. As a non-call stimulus, we used a white noise signal generated by a custom script that lasted about 1 s. The sound onset latencies were visually inspected on a digital audio editor. The red arrows indicate each latency: Phee call 49 ms, Tsik-Ek call 35 ms, Tsik-string call 16 ms, and white noise 0 ms.
Figure 2: Grand-averaged event-related potentials to the Phee, Tsik-Ek, and Tsik-string calls and noise (n = 9). The activity was aligned to the onset of each call or white noise. The black horizontal arrow indicates the period for the stimulus presentation. The Phee call lasted for approximately 2 s; the other stimuli lasted for 1 s. The red horizontal lines at Cz indicate the periods with significant differences between the auditory stimuli. All of these were between the Noise and the other calls, and there were no differences between the calls.
Figure 3: Example of time-frequency maps for calls. (A) The ERSP map for the Tsik-string call in subject R (145 months old); (B) The ERSP map for the same Tsik-string call in subject Y (23 months old). The left and right panels show the data recorded from the Fz and Cz electrodes, respectively. The red vertical lines indicate the timing of the onset of the auditory file, not the call onset. Abbreviation: ERSP = event-related spectral perturbation.
Figure 4: Relationship between mean event-related power and age of subjects. (A) Relationship between mean event-related power at α and β-band and age of subjects. The mean ERSP power was calculated at 8-29 Hz from a 1-150 ms period from each stimulus onset compared to those in the baseline period (-200 to 0 ms before stimulus onset). The left and right panels show the data recorded from the Fz and Cz electrodes, respectively. (B) Relationship between mean event-related power at γ-band and age of subjects. The mean ERSP power was calculated at 30-100 Hz from a 151-950 ms period from each stimulus onset compared to those in the baseline period (-200 to 0 ms before stimulus onset). Thus, negative ERSP values indicate a decrease in power compared to that before the stimulus presentation. Abbreviation: ERSP = event-related spectral perturbation.
Supplemental File 1: A zip file containing four audio files used in the experiments. The PH.wav file includes a stimulus with one long Phee call; TE.wav contains two Tsik-Ek calls with an interval between them; MB.wav contains a Tsik-string call stimulus with four consecutive Tsik (also called mobbing call); NO.wav contains a program-generated white Gaussian noise.
Supplemental File 2: A zip file containing postprocessing code.
Points to note about anesthesia
Both ketamine and xylazine administration have been attempted, and while these are analgesic and therefore suitable for long painful tasks, marmosets tend to experience decreases in blood oxygen levels without oxygen inhalation44. In short, alfaxalone is probably best suited for painless tasks such as shaving or mask making. In addition, for shaving, which takes only 10-15 min, inhalation anesthesia would be the most suitable. Isoflurane was not used during intubation due to its short duration and low concentration (approximately 1%).
Advantages of EEG measurement
Scalp EEG records brain activity through the skull, subcutaneous tissues, and skin, and therefore has a lower spatial resolution than single-neuron recordings or recordings with intracranial electrodes. Despite this disadvantage, there are several advantages to measuring scalp EEG in nonhuman primates. First, the method is noninvasive and does not injure the animals. For example, one may want to measure the brain activity of very valuable animal models of a disease created using a very difficult genetic engineering procedure. In such cases, this technique can be safely used to measure brain activity without requiring surgical intervention. Indeed, this method enabled us to acquire neural data from the same subject over a long period without injury. In addition, the number of subjects can be increased compared with invasive neurophysiological methods because it is easier and less time-consuming. Second, EEG data obtained from nonhuman primates using this technique can be directly compared with previously obtained human EEG data using the same behavioral paradigm. This advantage allowed us to examine cognitive neural activity from an evolutionary perspective. Third, our recording method allows recordings from awake primates. Anesthetic agents significantly attenuate cortical activity, although the degree and pattern of inhibition vary depending on the agent. This method can record brain activity while the animal is awake, without any influence of drugs. Furthermore, by administering drugs that act on the central nervous system and observing changes in brain activity, this technique can be applied in experiments involving neural mechanisms.
Functional brain imaging is another noninvasive method for exploring the brain mechanisms associated with vocal processing in humans and nonhuman primates. Recent technological advances have enabled the application of fMRI in animals in an awake state. Functional brain imaging has the advantage of allowing whole-brain exploration with high spatial resolution. However, the equipment is expensive to install and maintain. Conversely, EEG has the advantage of high temporal resolution and reveals more dynamic frequency-specific brain activity. In addition, the equipment has a relatively low cost and high portability, and recordings can be performed by a single experimenter. Furthermore, various stimuli and equipment can be easily introduced. Taking advantage of these methods and integrating findings from different techniques will provide a more detailed understanding of the neural mechanisms involved in vocal perception in marmosets.
Comparison with human or macaque data
In EEG, the signals generated in the brain must pass through the dura mater, skull, and subcutaneous tissue before being recorded on the scalp. These tissues act as low-pass filters. In particular, high-frequency components such as spikes are significantly attenuated compared to the alpha and beta band components. Therefore, unlike electrocorticography (ECoG), some components of brain activity cannot be detected on the scalp. This is a limitation of scalp EEG. Marmosets have thinner skulls, subcutaneous tissues, and dura mater than macaque monkeys and humans, while the size of the head is also smaller. Therefore, care must be taken when comparing marmoset results with those obtained in humans and macaques. For example, signals that originate in the deep brain, such as the brainstem, can be recorded on the marmoset scalp as relatively large signals because of the lower attenuation rate and relatively close distance between the signal source and electrodes.
In addition, marmosets have very few, if any, shallow cerebral sulci. This makes them quite distinct from humans, who have many sulci, and in whom the cortex is internalized as the gyrus. The number and depth of the cerebral sulci differ among humans, macaques, and marmosets. Electrical signals are generated perpendicular to the cortex and measured using electrodes placed over the cerebral gyri, where the apical dendrites of pyramidal neurons, which are the major sources of EEG signals, are aligned perpendicular to the electrodes. In the sulci, signals are attenuated and projected onto distant electrodes because they are generated horizontally with respect to the brain surface. When comparing results in humans with those in macaques and marmosets, it is necessary to consider the location of the EEG recordings and the anatomy of the area.
The authors have nothing to disclose.
This work was supported by the Hakubi Project of Kyoto University, Grant-in-Aid for Challenging Research (Pioneering) (No. 22K18644), Grant-in-Aid for Scientific Research (C) (No. 22K12745), Grant-in-Aid for Scientific Research (B) (No. 21H02851), and Grant-in-Aid for Scientific Research (A) (No. 19H01039). We would like to thank Editage (www.editage.jp) for English language editing.
Name | Company | Catalog Number | Comments |
Alfaxalone | Meiji Animal Health | Alfaxan | |
Amplifier | Brain Products | BrainAmp | |
Atropine | Fuso Pharmaceutical Industries | Atropine Sulfate Injection | |
Audio editor | Adobe | Adobe Audition | |
Data processing software | MathWorks | MATLAB | version R2023a |
Data processing toolbox | University of California-San Diego | EEGLAB | |
Data processing toolbox | University of California-Davis | ERPLAB | |
Electric shaver | Panasonic | ER803PPA | |
Electrode | Unique Medical | UL-3010 | AgCl coated (custom) |
Electrode gel | Neurospec AG | V16 SuperVisc | |
Electrode input box | Brain Products | EIB64-DUO | 64ch |
Glue | 3M | Scotch 7005S | |
Hair removal cream | Kracie | epilat for sensitive skin | |
Isoflurane | Bussan Animal Health | ds isoflurane | |
Liquid gum | San-ei Yakuhin Boeki | Arabic Call SS | Gum arabic+water |
Liquid nutrition | Nestlé Health Science Company | Isocal 1.0 Junior | Polymeric formula |
Maropitant | Zoetis | Cerenia injectable solution | |
Monitor Camera | Intel | RealSense LiDAR Camera L515 | |
Monkey pellets | Oriental Yeast | SPS | |
Primate chair | Natsume Seisakusho | Order made | |
Pulse oximeters | Covidien | Nellcor | PM10N |
Skin prepping paste | Mammendorfer Institut für Physik und Medizin | NeuPrep | |
Silicone tube | AsONE | Φ4 x 7mm | |
Speaker | Fostex | PM0.3 | |
Synchronization device | Brain Vision | StimTrak | |
Thermoplastic mask | CIVCO | MTAPU Type Uniframe Thermoplastic Mask 2.4mm | |