A protocol to assess the time evolution of the neural entrainment to external repetitive stimuli is presented. Steady-state recordings of the same experimental condition are acquired and averaged in the time-domain. The steady-state dynamics are analyzed by plotting the response amplitude as a function of time.
Neural entrainment refers to the synchronization of neural activity to the periodicity of sensory stimuli. This synchronization defines the generation of steady-state evoked responses (i.e., oscillations in the electroencephalogram phase-locked to the driving stimuli). The classic interpretation of the amplitude of the steady-state evoked responses assumes a stereotypical time-invariant neural response plus random background fluctuations, such that averaging over repeated presentations of the stimulus recovers the stereotypical response. This approach ignores the dynamics of the steady-state, as in the case of the adaptation elicited by prolonged exposures to the stimulus. To analyze the dynamics of steady-state responses, it can be assumed that the time evolution of the response amplitude is the same in different stimulation runs separated by sufficiently long breaks. Based on this assumption, a method to characterize the time evolution of steady-state responses is presented. A sufficiently large number of recordings are acquired in response to the same experimental condition. Experimental runs (recordings) are column-wise averaged (i.e., runs are averaged but epochs within recordings are not averaged with the preceding segments). The column-wise averaging allows analysis of steady-state responses in recordings with remarkably high signal-to-noise ratios. Therefore, the averaged signal provides an accurate representation of the time evolution of the steady-state response, which can be analyzed in both the time and frequency domains. In this study, a detailed description of the method is provided, using steady-state visually evoked potentials as an example of a response. Advantages and caveats are evaluated based on a comparison with single-trial methods designed to analyze neural entrainment.
When recorded from the scalp, brain electrical activity is observed as continuous and regular changes in voltage over time. This electrical activity is called the electroencephalogram (EEG) and was first described by Hans Berger in the late twenties of the last century1. Subsequent seminal studies described the EEG as a compound time series, in which different rhythmic or repetitive patterns can be observed2,3,4. Nowadays, the EEG is typically divided into five well-established frequency bands (delta, theta, alpha, beta, and gamma), which are associated with different sensory and cognitive processes.
For years, the study of brain oscillations using EEG was restricted to either analysis of the spectrum in the ongoing activity or changes in oscillatory activity elicited by non-periodic sensory events. In the last decades, different methodologies have been implemented for modulating ongoing EEG oscillations and exploring the effects of such modulations on perceptual and cognitive processes, including the presentation of rhythmic sensory stimulation for inducing neural entrainment. The term neural entrainment refers to the synchronization of neural activity with the periodic properties of sensory stimuli. This process leads to the generation of steady-state evoked potentials (i.e., EEG oscillations locked to the periodic properties of the driving stimuli). Steady-state evoked potentials are most commonly elicited by visual, auditory, and vibrotactile stimulation, using either transient stimuli presented at a constant rate or continuous stimulation modulated in amplitude at the frequency of interest. Whereas somatosensory steady-state evoked potentials (SSSEPs) are recorded in response to repetitive tactile stimulation5,6, steady-state visually evoked potentials (SSVEPs) are generally elicited by the periodic presentation of luminance flickers, pictures, and faces7,8. Auditory steady-state responses (ASSRs) are usually generated by trains of transient acoustic stimuli or by the continuous presentation of amplitude-modulated tones9,10.
The extraction of steady-state evoked potentials from the measured EEG essentially relies on averaging subsequently acquired EEG epochs time-locked to the stimulus11. Due to the periodicity of the responses, they can be analyzed in both time and frequency domains. After the frequency-domain transformation, the sensory response is observed as peaks of amplitude at the presentation rate or modulation frequency of the external stimuli, and their corresponding harmonics. These procedures (time-domain averaging and the subsequent frequency-domain transformation) have been essential for developing hearing tests for clinical purposes based on the detection of ASSRs12,13,14,15,16.
Furthermore, the classical time-domain averaging of EEG epochs has been extremely useful for analyzing physiological processes such as the generation and extinction of SSVEP17,18. Presenting consecutive trains of flickering lights and averaging subsequent epochs within a recording, Wacker et al.19 observed that the phase-locking index of the SSVEP rapidly increased during the first 400 ms of stimulation and remained high afterwards. They also reported that robust visual entrainment was established between 700 and 1,100 ms after stimulus onset. A certain degree of entrainment remained effective after the offset of the stimulation train, which lasted approximately three periods of the oscillatory response17,19. Those behaviors have been interpreted as the engaging/disengaging effect of the observed oscillations, which is a consequence of the nonlinear information processing in the human visual system17. Alternatively, it is known that under certain experimental conditions, the flicker stimulation can elicit on-responses at the beginning, and off-responses at the end of stimulation trains instead of neural entrainment18.
The main assumption underlying the averaging of consecutively acquired EEG epochs is that the EEG signal represents a linear combination of the sensory response and the background noise20. Furthermore, the amplitude, frequency, and phase of the oscillatory response are assumed to be stationary, whereas the background noise is considered to be random activity. However, when these assumptions are not met, the response amplitude computed after averaging several epochs does not necessarily correspond to the instantaneous amplitude of the evoked potential.
It has been recently reported that the ASSR generated in the brainstem of rats adapts to the continuous presentation of amplitude-modulated tones (i.e., the response amplitude decreases exponentially over time)21,22. Adaptation has been interpreted as a neural mechanism that reflects the loss of novelty of a monotonously repetitive sensory stimulus, increasing the sensitivity to relevant fluctuations in the acoustic environment23,24. In the auditory pathway, adaptation may enhance speech comprehension in noisy environments. Furthermore, this process may be a part of existing mechanisms to monitor the auditory feedback of one's own voice to control speech production.
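The additive model behind this assumption can be written compactly. The symbols below ($x_k$, $s$, $n_k$, $K$, $\sigma$) are introduced here for illustration only and do not come from the original text; the noise is assumed to be zero-mean, uncorrelated across epochs, and of equal variance $\sigma^2$ in every epoch.

```latex
x_k(t) = s(t) + n_k(t), \qquad k = 1, \dots, K

\bar{x}(t) = \frac{1}{K}\sum_{k=1}^{K} x_k(t)
           = s(t) + \frac{1}{K}\sum_{k=1}^{K} n_k(t),
\qquad
\operatorname{Var}\!\left[\bar{x}(t)\right] = \frac{\sigma^{2}}{K}
```

Under these assumptions the residual noise amplitude falls as $1/\sqrt{K}$ while $s(t)$ is preserved. If the response differs from epoch to epoch ($s_k(t)$ instead of $s(t)$), the same average recovers only the mean of the $s_k(t)$, which is why it no longer reflects the instantaneous amplitude.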
Analyzing the time evolution of the 40 Hz ASSR in humans, Van Eeckhoutte et al.25 observed a significant but small decrease in the response amplitude over time (around -0.0002 µV/s based on the group analysis, when assuming a linear decrease over time). Consequently, these authors concluded that the 40 Hz ASSR in humans does not adapt to the stimulation. In humans, non-stationary behaviors have been observed when analyzing the stability of the SSVEP26. These authors observed that the amplitude of the fundamental frequency and the second harmonic of the SSVEP were stationary in only 30% and 66.7% of the subjects they tested, respectively. The phases of both SSVEP frequency components, although relatively stable over time, exhibited small drifts26.
Therefore, although the classical time-domain averaging of subsequently acquired epochs allows stationary properties of the neural entrainment to be explored, this methodology needs to be revised when the long-term dynamics of the entrainment are the focus of the research, or when the averaging of short-term dynamics is corrupted by the occurrence of long-term dynamics. To characterize non-stationary behaviors of the steady-state responses, the evoked response computed at a given time window should not be compromised by those computed in the preceding EEG segments. In other words, the evoked potential should be extracted from the background noise without epochs being time-domain averaged with the preceding EEG segments.
In this study, a method for assessing the dynamics of neural entrainment is presented. Steady-state responses are repetitively recorded in response to the same stimulation, where consecutive recordings are interleaved by a resting interval of three times the length of the experimental run. Under the assumption that the time evolution of the physiological response is the same in different independent experimental runs (independent recordings), recordings are column-wise averaged. In other words, epochs corresponding to the same location in the different recordings are averaged, without averaging epochs within a recording. Consequently, the response amplitude computed at any stimulation interval will correspond to the instantaneous amplitude of the evoked potential. The sensory responses can be either analyzed in the time-domain or transformed into the frequency-domain, depending on the aim of the experiment. In any case, the amplitudes can be plotted as a function of time to analyze the time evolution of the steady-state response. Generation and extinction of the steady-state evoked potentials can be assessed by restricting the analysis to the first and last epochs of the recordings.
The dynamics of the neural entrainment can be analyzed using other approaches, such as narrowband filtering single-trial measurements around the frequency of interest and computing the envelope of the power signal using low-pass filtering25 and the Hilbert transformation27. Compared to these methodologies, the column-wise averaging of epochs allows steady-state parameters to be computed from signals with a higher signal-to-noise ratio (SNR). Recently, Kalman filtering has emerged as a promising technique for the estimation of 40-Hz ASSR amplitudes28,29,30. Implementation of Kalman filtering can improve the detection of steady-state responses closer to the electrophysiological threshold and reduce the time of the hearing test29. Furthermore, stationarity of the response does not need to be assumed when a Kalman filtering approach is used to estimate the ASSR amplitude30. Nevertheless, only one study has analyzed the time evolution of ASSRs using Kalman filtering25. The conclusion of that study is that the 40-Hz ASSR amplitude is stable over the stimulation interval. Therefore, Kalman filtering needs to be tested in conditions under which the ASSR is not stationary.
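The column-wise averaging described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' processing code: the function names, the sampling rate (256 Hz), the noise level, and the simulated amplitude decay are all assumptions made for the example.

```python
import numpy as np

def columnwise_average(epochs):
    """Average epochs across runs, keeping each epoch position separate.

    epochs: array of shape (n_runs, n_epochs, n_samples).
    Returns an array of shape (n_epochs, n_samples).
    """
    return np.asarray(epochs, dtype=float).mean(axis=0)

def amplitude_at(signal, fs, freq):
    """Spectral amplitude of each row of `signal` at `freq` (Hz) via the FFT."""
    signal = np.atleast_2d(signal)
    n = signal.shape[-1]
    spectrum = np.fft.rfft(signal, axis=-1)
    k = int(round(freq * n / fs))  # FFT bin closest to `freq`
    return 2.0 * np.abs(spectrum[:, k]) / n

# Synthetic demonstration (all values are hypothetical)
rng = np.random.default_rng(0)
fs, f0 = 256, 10.0                      # assumed sampling rate, 10 Hz response
n_runs, n_epochs, epoch_s = 30, 10, 4   # 30 runs of 10 epochs of 4 s
n = fs * epoch_s
t = np.arange(n) / fs
true_amp = np.linspace(2.0, 1.0, n_epochs)  # simulated decay across epochs
clean = true_amp[:, None] * np.sin(2 * np.pi * f0 * t)[None, :]
data = clean[None, :, :] + rng.normal(0.0, 5.0, (n_runs, n_epochs, n))

averaged = columnwise_average(data)             # shape (10, 1024)
amp_over_time = amplitude_at(averaged, fs, f0)  # one amplitude per epoch
epoch_onsets_s = np.arange(n_epochs) * epoch_s  # time axis for plotting
```

Plotting `amp_over_time` against `epoch_onsets_s` gives the time course of the response. Because the mean is taken only along the run axis, the amplitude in each epoch is never mixed with that of preceding epochs.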
Although time consuming, the column-wise averaging method is model-free and does not need initialization values and/or a priori definitions of the noise behavior. Furthermore, since it does not involve convergence times, the column-wise averaging may provide a more reliable representation of the onset of neural entrainment. Therefore, the results obtained with the column-wise averaging method can be considered as the ground truth for analyzing dynamics of the neural entrainment using Kalman filtering.
This description of the protocol is based on an example of SSVEP. However, it is important to note that the method presented here is modality-independent, such that it can also be used to analyze the time evolution of SSSEP and ASSR.
The present study was performed under approval of the Research and Ethics Committee of the Universidad de Valparaíso, Chile (assessment statement code CEC170-18), and conformed to the national guidelines for research with human subjects.
1. Preparation
2. Subject preparation
3. EEG acquisition and pre-processing
4. Computation of the response amplitudes
SSVEP was elicited by continuous visual stimuli of 40 s in length, where the light intensity was modulated by a sinusoidal wave of 10 Hz (modulation depth of 90%). Stimuli were delivered by four light-emitting diodes (LEDs) situated in the center of a 50 cm x 50 cm black screen, as vertexes of a 5 cm x 5 cm square. With the participant seated 70 cm from the screen, the square of LEDs subtended a visual angle of about 4°. The LED screen was designed using a USB-based microcontroller development system and four super bright white LEDs of 10 mm in diameter. The pulse width modulation (PWM) technique was used to control the power supplied to the LEDs. This technique controlled the LED intensities at a given frequency and generated the final sinusoidal envelope. A PWM frequency of 40 kHz was used to avoid a perceivable flicker effect.
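The stimulus geometry and the PWM envelope can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the duty-cycle formula assumes that LED intensity is proportional to duty cycle and that the envelope peak is mapped to a 100% duty cycle, neither of which is stated in the text.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object viewed from a distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

def pwm_duty_cycle(t, mod_freq_hz=10.0, depth=0.9):
    """Duty cycle (0..1) at time t (s) for a sinusoidal intensity envelope.

    Assumes intensity = mean * (1 + depth * sin(2*pi*f*t)), with the mean
    chosen so that the envelope peak corresponds to a duty cycle of 1.
    """
    mean = 1.0 / (1.0 + depth)
    return mean * (1.0 + depth * math.sin(2.0 * math.pi * mod_freq_hz * t))

angle = visual_angle_deg(5.0, 70.0)      # side of the LED square seen at 70 cm
pwm_periods_per_cycle = 40_000 / 10.0    # PWM periods per 10 Hz modulation cycle

d_max = pwm_duty_cycle(0.025)            # envelope peak (quarter period)
d_min = pwm_duty_cycle(0.075)            # envelope trough
depth_check = (d_max - d_min) / (d_max + d_min)
```

With these values the side of the square subtends about 4.1°, consistent with the approximate 4° reported, and each 10 Hz cycle spans 4,000 PWM periods, so the envelope is finely sampled.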
Thirty recordings were obtained, which were segmented in epochs of 4 s. Therefore, a dataset composed of 10 columns (number of EEG epochs within recordings) and 30 rows (number of recordings, number of experimental runs) was obtained.
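The organization of the dataset can be illustrated with a short sketch that cuts each recording into consecutive epochs. The function name and the sampling rate (512 Hz) are assumptions for the example; a single channel per recording is assumed.

```python
import numpy as np

def build_epoch_matrix(recordings, fs, epoch_s):
    """Cut each recording into consecutive, non-overlapping epochs.

    recordings: array of shape (n_runs, n_samples_per_run), one channel.
    Returns an array of shape (n_runs, n_epochs, samples_per_epoch);
    trailing samples that do not fill a whole epoch are discarded.
    """
    recordings = np.asarray(recordings)
    n_runs, n_samples = recordings.shape
    samples_per_epoch = int(round(fs * epoch_s))
    n_epochs = n_samples // samples_per_epoch
    trimmed = recordings[:, : n_epochs * samples_per_epoch]
    return trimmed.reshape(n_runs, n_epochs, samples_per_epoch)

fs = 512                                 # hypothetical sampling rate (Hz)
recordings = np.zeros((30, 40 * fs))     # 30 runs of 40 s (placeholder data)
matrix = build_epoch_matrix(recordings, fs, epoch_s=4)
print(matrix.shape)  # (30, 10, 2048)
```

Rows index the experimental runs and columns index the position of the epoch within the run, so averaging along the first axis is the column-wise average.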
The neural oscillation time-locked to the stimulation became evident as the column-wise averaging was performed (Figure 2). Significantly, the interval at which the SSVEP is generated can be observed in traces corresponding with column 1. In that column, 0.2 s of pre-stimulus baseline are plotted in addition to the first 0.8 s of neural entrainment. Therefore, the procedure described here allows characterization of 1) the dynamics of the oscillatory response once neural entrainment is already established and 2) the engagement of neural oscillations. One or more epochs recorded after the end of stimulation can also be included in the data matrix to study extinction of the steady-state response after stimulus offset.
During the column-wise averaging of epochs, the mean amplitude of the SSVEP (spectral amplitude at 10 Hz, computed by applying the FFT) decreased during the averaging of the first epochs of the columns and tended to stabilize afterward (Figure 3A). This result agrees with previous studies analyzing the evolution of ASSR during the averaging of sequentially acquired epochs21,22,40,43,44. The behavior of the response amplitude during averaging is usually explained by the relatively high contribution of unaveraged noise to the response amplitude computed in the first epochs, which is attenuated as averaging is performed13,44,45,46,47. Notably, the SSVEP amplitude variability significantly decreased as averaging progressed.
We also analyzed the residual noise level (RNL) of the measurements during the column-wise averaging of epochs (Figure 3B). The RNL was computed in a narrow frequency band (3 Hz) on both sides of the frequency of the SSVEP. Although this procedure is not common when SSVEPs are analyzed, vector-averaging a given number of frequency bins around that of the neural entrainment is the standard for estimating the RNL in ASSR measurements41,42,43. As expected, the RNL progressively decreased as the number of averaged epochs increased and reached the asymptotic level after about 20 epochs were processed. Unlike the SSVEP amplitude, the standard deviation of the RNL remained relatively constant as the number of averaged epochs increased, which suggests that the recording conditions were stable throughout the experimental session.
The results presented above determined the changes in the peak signal-to-noise ratio (pSNR) of the measurements during the column-wise averaging of epochs (Figure 3C). This term is defined here as the ratio (in dB) between the squared amplitude of the response (SSVEP) and the squared amplitude of the RNL. The pSNR increased with the number of averaged epochs up to approximately 18. Further increments in the number of averaged epochs did not significantly impact the quality of the signal. The variability of the pSNR decreased as more epochs were averaged.
Finally, the dynamics of the SSVEP amplitude and the RNL are represented in Figure 4. These time evolutions were obtained by plotting the response parameters computed at the end of the column-wise averaging of epochs as a function of the number of columns (as a function of time). As demonstrated by Labecki et al.26, the dynamics of SSVEP can significantly vary among subjects. Since the results presented in Figure 4 correspond to a single individual, generalizations cannot be made. In this subject, the amplitude of the SSVEP displayed a relatively complex behavior (Figure 4A). The response amplitude gradually increased during the first 12 seconds following the stimulus onset (time which corresponds to the length of 3 epochs). As the stimulus persisted, the SSVEP consistently decreased during the following 12 seconds, and remained relatively constant afterwards. These results cannot be explained by the behavior of the RNL, since this parameter was relatively constant during the stimulation interval (Figure 4B). The increase in the SSVEP amplitude following the stimulus onset is evident in the traces presented in Figure 2 and can be explained by integration processes, which result in stabilization of the neural entrainment. The subsequent decrease in amplitude suggests the adaptation of SSVEP to the sustained stimulation. Nevertheless, these hypotheses need to be tested in controlled experiments with appropriate sample sizes.
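The three quantities discussed here (response amplitude, RNL, and pSNR) can be computed from one averaged epoch as sketched below. This is an approximation under stated assumptions: the RNL is taken as the root-mean-square amplitude of the FFT bins within 3 Hz on each side of the response frequency, which stands in for the vector-averaging estimator mentioned in the text, and the function name, sampling rate, and synthetic signal are hypothetical.

```python
import numpy as np

def response_rnl_psnr(epoch, fs, freq, half_band_hz=3.0):
    """Return (response amplitude, RNL, pSNR in dB) for one averaged epoch.

    The RNL is estimated here as the RMS amplitude of the bins within
    `half_band_hz` on either side of `freq`, excluding the response bin.
    """
    epoch = np.asarray(epoch, dtype=float)
    n = len(epoch)
    amp = 2.0 * np.abs(np.fft.rfft(epoch)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))        # response bin
    neighbours = np.abs(freqs - freq) <= half_band_hz
    neighbours[k] = False                           # exclude the response itself
    rnl = np.sqrt(np.mean(amp[neighbours] ** 2))
    psnr_db = 10.0 * np.log10(amp[k] ** 2 / rnl ** 2)
    return amp[k], rnl, psnr_db

# Synthetic 4 s epoch: 10 Hz response of amplitude 1 in unit-variance noise
rng = np.random.default_rng(1)
fs = 256                                            # hypothetical sampling rate
t = np.arange(4 * fs) / fs
epoch = np.sin(2 * np.pi * 10.0 * t) + rng.normal(0.0, 1.0, t.size)
response, rnl, psnr_db = response_rnl_psnr(epoch, fs, 10.0)
```

With 4 s epochs the frequency resolution is 0.25 Hz, so a 3 Hz band on each side contributes 12 bins per side to the noise estimate.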
Figure 1: Critical steps for extracting the time evolution of the amplitude of steady-state responses. (A) Screenshot of the processing code, where analysis parameters are defined. (B) Representative diagram illustrating the organization of the dataset. A data matrix composed of 30 recordings of 10 epochs is represented. The column-wise averaging of epochs is highlighted in the first column. The vertical line represents the direction of the averaging.
Figure 2: Changes in the waveform of steady-state visually evoked potentials (SSVEP) during the column-wise averaging of epochs. Responses were elicited by the continuous presentation of light modulated in amplitude at 10 Hz. The rows show the waveforms obtained after averaging all previous recordings (i.e., row 1 is the first recording, row 5 is the waveform obtained after averaging the first five recordings, and the last row is the average of all recordings). More reliable waveforms of SSVEP were observed in each column as the number of averaging runs increased. To provide clarity (to make the oscillations of the SSVEP visible), only the first second of the epochs is represented. The exceptions are traces in the first column of the data set, for which 0.2 seconds of pre-stimulus baseline are displayed.
Figure 3: Changes in the response and recording parameters during the column-wise averaging of epochs. (A) Evolution of the SSVEP amplitude. (B) Behavior of the RNL. (C) Changes in the pSNR. Black lines represent the mean values obtained for each column (n = 10) and the grey shadow represents the area covered by ± one standard deviation.
Figure 4: Time evolution of the SSVEP elicited by the presentation of continuous visual stimulation, modulated in amplitude at 10 Hz. (A) Time course of the SSVEP amplitude. (B) Time course of the RNL.
This work describes an experimental procedure for analyzing the dynamics of oscillatory brain responses. The methodology consists of acquiring a sufficient number of independent experimental runs of the same experimental condition, and time-domain averaging epochs corresponding to the same time window in the different recordings (column-wise averaging in Figure 1B). The amplitude computed in the averaged data represents the instantaneous amplitude of the oscillatory response. Plotting these amplitudes as a function of time (or the number of columns in the dataset) allows analyzing the time evolution of the oscillatory response time-locked to the stimulation. This methodology is a modification of that proposed by Ritter et al.23 for analyzing the adaptation of transient cortical evoked potentials. The method has been used to analyze the dynamics of auditory evoked potentials in both humans24 and animal models20,21.
From a methodological point of view, the combination of parameters used to elicit the steady-state response and those implemented to extract the neural response from background noise is critical to analyze the time evolution of steady-state evoked potentials22. The stimulus length used in the experiment presented here (40 s) was selected based on results obtained in a pilot study. This stimulus length was sufficient to analyze the adaptation of ASSR generated in the rat brainstem21,22. Furthermore, the stimulus length should exceed the time at which the asymptotic instantaneous band power of SSVEPs is reached (Figure 1 in Labecki et al.26). Nevertheless, the asymptotic instantaneous band power of SSVEPs can be reached beyond 60 s in some cases (Figure 2 in Labecki et al.26). Therefore, running a small-sample pilot study is recommended to define the stimulus length. Otherwise, a stimulus length longer than 90 s is recommended to achieve complete representation of the time evolution of the response. Using adequately long pauses between consecutive recordings allows consecutive experimental runs to be treated as statistically independent (i.e., different, independent measures of the same variable). To the best of our knowledge, no experiments have been performed to analyze the optimum pause between runs (minimum pause required to make runs independent from each other). The criterion of using pauses at least three times longer than the stimulus length is conservative enough to ensure that the steady-state response recorded in any given run is not affected by the previous stimulation.
Recently, alternating stimuli (experimental conditions) has been proposed as a way to reduce the pause between experimental runs while avoiding extra adaptation effects25. Likewise, the number of experimental runs (30) implemented in this experimental protocol is conservative, since the asymptotic RNL and pSNR are typically reached after averaging approximately 20 experimental runs. When stimuli fall within the middle-upper region of the dynamic range of the response (high sensation levels), fewer runs are likely needed to analyze the dynamics of the evoked response. Nevertheless, in cases in which different experimental conditions are tested, having the same number of experimental runs is crucial for making comparisons among conditions (i.e., different sensation levels).
In addition to the column-wise averaging of epochs, the dynamics of oscillatory evoked potentials have been analyzed by filtering the single-trial measurements in a narrow frequency band around the frequency of interest and computing the envelope of the power signal using low-pass filtering26. Likewise, single-trial analysis has been implemented to characterize the transition period that precedes the stable region of SSVEP48, and the changes in amplitude and phase of the SSVEP during the stable region of the response49. While single-trial analyses allow discrimination of relatively fast fluctuations in response amplitude, experimental designs that analyze the average response in blocks separated by a given inter-block interval only account for long-term variations in the amplitude of the evoked potential50,51. The column-wise averaging of epochs stands between these two options. Converting the averaged signal to the frequency-domain using the FFT implies analyzing the dynamics of the response with a resolution equal to the length of the epoch. In the example presented here, the SSVEP was reported every 4 s. Although a resolution of 4 s is adequate to describe dynamics that unfold over tens of seconds, such as that of the SSVEP26, partially overlapping epochs in the original recordings allow the time evolution of the steady-state response to be described in a more refined manner25.
Dynamics of the steady-state responses obtained after column-wise averaging of epochs mainly represent evolution of the oscillatory activity that is synchronized among the averaged EEG segments (those which survive the averaging). Therefore, a major issue regarding the feasibility of the methodology is the possible attenuation of response amplitudes due to variations in the phase of neural oscillations from one independent experimental run to another (i.e., among recordings). This topic needs to be addressed experimentally. However, evidence indicates that the phase of brain oscillatory responses is less variable than expected. In fact, several studies have reported a regularity in the expected phase of the human 80 Hz ASSR47,48,49. When latencies are estimated based on the phase of the oscillatory activity, the predictable effect of the intensity and the carrier frequency of the acoustic stimuli on the latency of the auditory responses has been observed (i.e., the latency decreases as the intensity and carrier frequency increase)52,53,54. Furthermore, typical maturational changes in amplitude and the left-to-right asymmetry in the hearing levels have also been observed when latencies are estimated from the phase of the ASSR47,55,56,57,58. When describing the time evolution of SSVEP using single-trial analysis, Labecki et al.26 observed that although inter-trial variability of the response amplitudes within the same subject was considerably high, variability of the phase was significantly less pronounced.
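The use of partially overlapping epochs can be sketched as a sliding window over the run-averaged signal. This is one possible illustration, not the procedure of the cited study: it assumes the full-length runs are first averaged sample by sample (which equals the column-wise average for non-overlapping epochs, by linearity), and the function name, sampling rate, window step, and synthetic amplitude decay are hypothetical.

```python
import numpy as np

def sliding_amplitude(avg_run, fs, freq, win_s=4.0, step_s=1.0):
    """Amplitude at `freq` in overlapping windows of a run-averaged signal.

    avg_run: 1-D array, the sample-wise average of the full-length runs.
    Returns (window_start_times_s, amplitudes).
    """
    avg_run = np.asarray(avg_run, dtype=float)
    win = int(round(win_s * fs))
    step = int(round(step_s * fs))
    starts = np.arange(0, len(avg_run) - win + 1, step)
    k = int(round(freq * win / fs))                 # FFT bin of `freq`
    amps = np.array(
        [2.0 * np.abs(np.fft.rfft(avg_run[s:s + win])[k]) / win for s in starts]
    )
    return starts / fs, amps

# Synthetic averaged run: 40 s at 10 Hz, amplitude decaying from 2 to 1
fs = 256                                            # hypothetical sampling rate
t = np.arange(40 * fs) / fs
avg_run = (2.0 - t / 40.0) * np.sin(2 * np.pi * 10.0 * t)
times, amps = sliding_amplitude(avg_run, fs, 10.0)
```

With a 4 s window advanced in 1 s steps, the 40 s run yields 37 amplitude estimates instead of 10. Each estimate still integrates 4 s of signal, so neighboring values are correlated and the frequency resolution is unchanged.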
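The size of the attenuation at stake can be estimated under a simple model, given here as an illustration only. It assumes that each run $k$ contains the same sinusoidal response of amplitude $A$ and frequency $f$ with an independent phase $\varphi_k$; these symbols, and the Gaussian phase distribution with mean $\varphi_0$ and standard deviation $\sigma_\varphi$, are assumptions that do not appear in the original text.

```latex
\bar{s}(t) = \frac{1}{K}\sum_{k=1}^{K} A\cos\!\left(2\pi f t + \varphi_k\right)
\;\xrightarrow[K\to\infty]{}\;
A\,\bigl|\mathbb{E}\!\left[e^{i\varphi}\right]\bigr|\cos\!\left(2\pi f t + \varphi_0\right),
\qquad
\bigl|\mathbb{E}\!\left[e^{i\varphi}\right]\bigr| = e^{-\sigma_\varphi^{2}/2}
\quad\text{for}\quad \varphi \sim \mathcal{N}\!\left(\varphi_0, \sigma_\varphi^{2}\right)
```

Under this model, a phase jitter of $\sigma_\varphi = 0.5$ rad across runs would reduce the averaged amplitude to about 88% of $A$, whereas uniformly distributed phases would cancel the response entirely. Low inter-run phase variability is therefore the condition under which the averaged amplitude remains close to the true one.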
Based on their observations, Labecki et al.26 suggested that a minimum of 50 trials should be averaged to obtain a reliable estimation of the mean power envelope of the response. These results indicate that, even when the amplitude of the response is computed in single trials, averaging (of envelopes in that case) is needed to report trustworthy results. Moreover, the inter-trial variability in the amplitude of SSVEP reported by Labecki et al.26 suggests that the computation of this parameter in single trials can be highly influenced by background noise. Considering the evolution of the signal-to-noise ratio presented in Figure 2, the computation of the response in the averaged signal instead of single trials significantly reduces the number of EEG segments needed to be processed to obtain reliable measurements. Additionally, the low variability in phase obtained by Labecki et al.26 supports the idea that the column-wise averaging of epochs presented here is a valid procedure for computing the dynamics of oscillatory evoked potentials.
Averaging the data at different levels leads to different interpretations of the results. Regarding oscillatory evoked potentials, computing the response amplitude after the time-domain averaging of independent runs implies analyzing only time-locked oscillations (i.e., those that survive the averaging). This procedure may filter out relevant information regarding the dynamics of the response in individual trials. However, it guarantees a sufficiently high signal-to-noise ratio of the measurements. This aspect might be of significance when the responses are close to the electrophysiological threshold, a condition in which the detection of the entrainment can be compromised due to the low signal-to-noise ratio of the measurement.
The authors have nothing to disclose.
The authors gratefully acknowledge Lucía Zepeda, Grace A. Whitaker, and Nicolas Nieto for their contributions to video production. This work was supported in part by CONICYT programs BASAL FB0008, MEC 80170124 and PhD scholarship 21171741, as well as the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health under award number P50DC015446. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Active electrodes | Biosemi | P32-1020-32ACMS (ABC) | for channels 1-32 |
Active electrodes | Biosemi | P32-1020-32A (ABC) | for channels 33-64 |
Active electrodes | Biosemi | 8 x TP FLAT | external electrodes |
Active-Two acquisition system | Biosemi | version 7.0 | EEG acquisition system |
alcohol | Salcobrand | Code: 3309011 | for cleaning the scalp |
Electrode cap 64 channels | Biosemi | CAP MS xx yy | cap |
Electrode cap 64 channels | Biosemi | CAP ML xx yy | cap |
gel | Biosemi | SIGNA BOX12 | conductive gel |
Laptop | Asus | Core i7 1TB HDD + 128GB SSD 8GB RAM | computer for stimulation |
Laptop | Asus | Core i7 1TB HDD + 128GB SSD 8GB RAM | computer for recording |
LED screen | in-house production | – | The screen consists of four light-emitting diodes (LEDs) situated on the center of a 50×50 cm black screen, as vertexes of a square of 5×5 cm |
sterile gauze | Salcobrand | Code: 8730277 | for cleaning the scalp |