Learning new stimulus-response associations engages a wide range of neural processes which are ultimately reflected in changing spike output of individual neurons. Here we describe a behavioral protocol allowing for the continuous registration of single-neuron activity while animals acquire, extinguish, and reacquire a conditioned response within a single experimental session.
While the subject of learning has attracted immense interest from both behavioral and neural scientists, only relatively few investigators have observed single-neuron activity while animals are acquiring an operantly conditioned response, or when that response is extinguished. But even in these cases, observation periods usually encompass only a single stage of learning, i.e. acquisition or extinction, but not both (exceptions include protocols employing reversal learning; see Bingman et al.1 for an example). However, acquisition and extinction entail different learning mechanisms and are therefore expected to be accompanied by different types and/or loci of neural plasticity.
Accordingly, we developed a behavioral paradigm which institutes three stages of learning in a single behavioral session and which is well suited for the simultaneous recording of single neurons' action potentials. Animals are trained on a single-interval forced choice task which requires mapping each of two possible choice responses to the presentation of different novel visual stimuli (acquisition). After having reached a predefined performance criterion, one of the two choice responses is no longer reinforced (extinction). Following a certain decrement in performance level, correct responses are reinforced again (reacquisition). By using a new set of stimuli in every session, animals can undergo the acquisition-extinction-reacquisition process repeatedly. Because all three stages of learning occur in a single behavioral session, the paradigm is ideal for the simultaneous observation of the spiking output of multiple single neurons. We use pigeons as model systems, but the task can easily be adapted to any other species capable of conditioned discrimination learning.
Learning new stimulus-response-outcome associations engages a wide range of neural plasticity processes. These processes are ultimately reflected in the changing spike output of individual neurons. Arguably, one of the most frequently employed learning paradigms is Pavlovian fear conditioning conducted with rodents. In this setting, the acquisition and extinction of a conditioned response take place within a few dozen trials2. The rapid development of conditioned fear can be advantageous because it allows running a large number of animals within a short time. Also, acquisition and extinction can be observed within a few tens of trials on a single day in naive animals3,4 or spread across 2 to 3 days2,5-8. However, the insights gained about the changes of neural activity during learning in these experiments do not necessarily apply outside the domain of fear conditioning. For example, goal-directed behavior driven by positive reinforcement is more adequately modeled by operant rather than Pavlovian conditioning procedures, and may in part depend on different neural substrates9,10. Also, fear conditioning develops so rapidly that neural responses to the CS can only be observed for a few dozen trials, placing severe limits on the analysis of changes of neural activity during learning.
Unfortunately, the acquisition and extinction of operant responding usually takes many days. This is detrimental for neurophysiological investigations, because it is notoriously difficult to record the activity of single cells over more than a few hours. Due to the high similarity of the waveforms of extracellularly recorded action potentials, it is problematic to claim that spikes recorded on one day are generated from the same cell as spikes with similar waveforms recorded on the next11,12, especially in areas with a high cell density such as the hippocampus.
To address these issues, we developed a novel behavioral paradigm utilizing 3 learning conditions within one experimental session on a single day. This requires that the experimental animal is willing to perform hundreds of trials under varying conditions on a thin schedule of reinforcement. Homing pigeons (Columba livia forma domestica) are classic model organisms in experimental psychology13-17. These birds are able to perform complex visual discriminations18, can flexibly adapt behavior to changing reinforcement contingencies19,20, and are uniquely avid workers, performing 1,000 trials with a minimal amount of reinforcement. These characteristics make them especially suitable for the experiments described below.
Ethics Statement
All experiments were conducted in accordance with the German guidelines for the care and use of animals in science. Procedures were approved by a national ethics committee of the state of North Rhine-Westphalia, Germany.
System overview
Operant Testing Chamber
The operant chamber (Figure 1) measures 34 cm x 34 cm x 50 cm. Three translucent response keys (4 cm x 4 cm, located approximately 20 cm above floor level) are recessed into the back wall of the chamber. Stimuli are shown through an LCD flat screen mounted behind the response keys. Two 2-Watt light bulbs located at the side walls provide dim illumination. The chamber is housed in a sound-attenuating cubicle, and loudspeakers provide white noise at all times to mask extraneous sounds. Food (grain) is provided by a food hopper located below the center key. Experimental hardware is controlled by custom-written MATLAB code21. Animals are constantly monitored through a digital camera attached to the front wall of the chamber.
Custom-built Microdrives
Microdrives housing 16 electrode wires are custom-built in our laboratory; the design is based on work by Bilkey and colleagues22,23, and the reader is referred to these articles for a detailed description. We modified their design to allow for a larger number of electrodes (16 instead of 8; 25 µm nichrome wires), and we connect the electrode wires via conductive silver glue to the headstage socket. Additionally, we use gold-plating of the electrode tips to reduce impedance and to achieve better signal-to-noise ratios (apply -3 V for ~3 sec; impedances should drop to <100 kΩ).
Once the microdrive is assembled, electrodes are cut to the desired length, tips are cleaned in an ultrasonic bath (Tergazyme in distilled water) for 20 min and rinsed another 20 min in distilled water. Gold-plating of electrode tips should take place immediately before implantation. For grounding, we use a silver ball electrode placed above the lateral cerebellum. Specification of materials is provided in the Materials table which accompanies this article.
An important issue when working with freely moving animals is movement artifacts. We found that movement artifacts in our setups are largely due to a) high electrode impedances (>500 kΩ) and b) imperfect attachment of the contacts between the plug (implant) and the socket (headstage) while the animal is moving. A variety of commercially available microconnectors does not perform satisfactorily for recording from freely moving birds, because the mechanical contact between plug and socket rapidly deteriorates through vigorous movements of the pigeons (head-bobbing, key-pecking). The best mechanical connection between implant and headstage was achieved with headplug assemblies from Ginder Scientific. These plug-socket assemblies feature 18 contacts and are firmly affixed to each other by a ring nut.
Electrophysiological Recording Setup
The electrophysiology setup comprises the following components: 1) a custom-built headstage with unity gain (operational amplifier), 2) 15 differential amplifier modules housed in two rack mount units (DPA-2FS and EPMS-07, respectively; npi electronic GmbH, Germany), and 3) a 16-channel analog-to-digital converter (power 1401 mark I). Raw signals are amplified 1,000x and band-pass filtered (500-5,000 Hz, 1st order filter), digitized with a sampling rate of 16-20 kHz, and stored with Spike2 Version 7.06a for offline processing. Event times (such as stimulus onset or individual key pecks of the animal) are captured via a laboratory-built parallel port IO box (see Rose et al.21) and forwarded to the AD converter for storage along with the neurophysiological data (see Figure 1). Offline processing is described further below.
Figure 1. System overview. Information flow is symbolized by colored arrows. Computer 1 controls hardware pertaining to behavioral output (stimulus display via the flat screen monitor, house light, food hopper, feeder light, response keys) and sends event timestamps to the A/D converter. Computer 2 stores neurophysiological signals obtained from the A/D converter and event timestamps received from Computer 1. The photograph on the left shows the conditioning chamber inside the sound-attenuating cubicle. Its elements are: 1) sound-attenuating shell, 2-4) response keys, 5) food hopper, 6) feeder light, 7) house light, 8) observation camera.
Single-Interval-Forced-Choice (SIFC) Discrimination Task
For clarity, we will describe the final SIFC task here and then explain the steps needed to train animals on this task below.
The SIFC task is outlined in Figure 2. After the intertrial interval (ITI) has elapsed, the center key is transilluminated green for up to 5 sec ('initialization phase'). Immediately following the third response of the animal within 5 sec, one out of several sample stimuli is presented on the center key for 2 sec ('sample phase'; example stimuli are shown in the inset to Figure 2). After 2 sec, the center key is again transilluminated green, and the animal has to respond once more before the two side keys are transilluminated ('confirmation phase'). Depending on the identity of the stimulus shown in the sample phase, the animal is required to direct a single response to either the left or the right key ('choice phase'). If it chooses the correct destination, access to reward (grain) is granted for 2 sec. Thus, the core of the task consists of responding to the left choice key after presentation of one particular stimulus on the center key, and responding to the right choice key after presentation of another stimulus. The reason that the sample phase is bracketed by an initialization and a confirmation phase is to keep the animals' head in front of the center key while the sample stimulus is presented.
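The phase sequence described above can be summarized as a small piece of trial logic. The following Python sketch is illustrative only and is not the authors' MATLAB control code; the function names and the outcome labels `not_initiated`, `not_confirmed`, and `no_choice` are hypothetical, and the text does not state how the task treats trials that are not initiated or confirmed.

```python
from typing import Optional, Sequence

INIT_PECKS = 3        # responses required on the green center key
INIT_WINDOW_S = 5.0   # duration of the initialization phase (sec)
SAMPLE_S = 2.0        # fixed sample presentation time (sec)


def sample_onset(init_peck_times: Sequence[float]) -> Optional[float]:
    """Time of the third peck inside the initialization window, or None.

    Peck times are given in seconds relative to onset of the green center key.
    """
    valid = sorted(t for t in init_peck_times if 0.0 <= t <= INIT_WINDOW_S)
    if len(valid) < INIT_PECKS:
        return None
    return valid[INIT_PECKS - 1]


def classify_trial(correct_side: str,
                   init_peck_times: Sequence[float],
                   confirmation_peck: bool,
                   choice: Optional[str]) -> str:
    """Walk through initialization, confirmation, and choice for one trial."""
    if sample_onset(init_peck_times) is None:
        return "not_initiated"      # hypothetical label, see lead-in
    if not confirmation_peck:
        return "not_confirmed"      # hypothetical label, see lead-in
    if choice is None:
        return "no_choice"          # hypothetical label, see lead-in
    return "correct" if choice == correct_side else "incorrect"


if __name__ == "__main__":
    onset = sample_onset([0.4, 1.1, 1.9])
    print("sample shown from", onset, "to", onset + SAMPLE_S, "sec")
    print(classify_trial("left", [0.4, 1.1, 1.9], True, "right"))
```

Keeping the phase logic in pure functions like these makes it straightforward to check the timing rules offline against logged peck times.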
Once the animal masters this task for a single pair of stimuli (henceforth, 'familiar' stimuli, FS), it is presented with a novel stimulus (NS) pair in every new session, and has to learn which of the two novel stimuli is to be followed by a response to the left or the right choice key. The FS pair continues to be presented during those experiments to serve as a suitable control condition. Adequate performance on the final task hinges crucially on the animals' willingness to perform >1,000 trials at overall reinforcement probabilities <0.5. The following paragraphs describe a training procedure in which task complexity is gradually increased until the animal reaches the level of the SIFC; at the same time, reinforcement probability needs to be gradually reduced and the number of trials per session increased to ensure consistently high performance on the final task.
1. Animal Training
Figure 2. Illustration of the behavioral paradigm. After an ITI of 5 sec, the center key is transilluminated green for up to 5 sec (initialization). If the animal responds 3x within these 5 sec, 1 out of the 4 sample stimuli is presented at the same position. After a fixed sample presentation time of 2 sec during which the animal has to respond at least once, the central pecking key is transilluminated green again (confirmation). After another peck, the 2 side keys are transilluminated green. The subject indicates its choice by responding once to one of the side keys. During acquisition and reacquisition, correct responses are followed by 2 sec food access accompanied by activation of the feeder light, or activation of the feeder light alone. If incorrect, house lights are turned off for 3 sec. During extinction, both correct and incorrect responses to the extinction stimulus remain inconsequential. Inset shows example novel and familiar stimulus pairs.
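The consequences of a choice in each learning stage, as described in the caption above, can be written as a single function. This is a minimal Python sketch under stated assumptions: the function name, the outcome labels, and the `reward_prob` parameter are hypothetical; it assumes that food versus feeder light alone is drawn independently on every correct trial, and that stimuli other than the extinction stimulus keep their normal contingencies during extinction.

```python
import random


def choice_consequence(stage: str,
                       is_extinction_stimulus: bool,
                       correct: bool,
                       reward_prob: float,
                       rng: random.Random) -> str:
    """Return the consequence of one choice response.

    stage: "acquisition", "extinction", or "reacquisition".
    reward_prob: probability of food access on a correct trial (assumed to be
    an independent draw per trial; not specified in the text).
    """
    if stage == "extinction" and is_extinction_stimulus:
        # Correct and incorrect responses both remain inconsequential.
        return "none"
    if not correct:
        # House lights are turned off for 3 sec.
        return "house_light_off_3s"
    if rng.random() < reward_prob:
        # 2 sec food access accompanied by the feeder light.
        return "food_2s_with_feeder_light"
    return "feeder_light_only"


if __name__ == "__main__":
    rng = random.Random(0)
    for stage in ("acquisition", "extinction", "reacquisition"):
        print(stage, choice_consequence(stage, True, True, 0.5, rng))
```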
2. Electrophysiology
Figure 3. Quality metrics for unit isolation. A) Heat map of all waveforms' time-voltage values. B) Distributions of maximum (red), minimum (green), and noise (blue) voltage values of all waveforms. The distributions are well separated, indicating excellent unit isolation. C) Spontaneous firing rate (red, calculated from 2-sec segments in all intertrial intervals) and spike amplitudes (peak-to-peak) as a function of time in session. Both curves were smoothed with a boxcar function (width: 50 data points). D) Interspike-interval distribution for this unit. Bin width, 10 msec (inset: 1 msec). Very short intervals are nearly absent (<0.1% of intervals below 4 msec). E) PSTH triggered to key pecks. Event counts close to the key peck (±20 msec) are highlighted red. F) All 157 waveforms recorded within ±20 msec of key peck events. The waveforms compare favorably to overall waveform shape shown in panel A.
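One of the metrics in panel D, the proportion of very short interspike intervals, is simple to compute from sorted spike times. The Python sketch below is an illustration rather than the analysis code used for Figure 3; the function name is hypothetical, and the 4 msec threshold is taken from the caption.

```python
from typing import Sequence


def isi_violation_fraction(spike_times_s: Sequence[float],
                           threshold_s: float = 0.004) -> float:
    """Fraction of interspike intervals shorter than threshold_s.

    spike_times_s: spike times of one sorted unit, in seconds.
    A well-isolated single unit should show very few such intervals
    because of the refractory period.
    """
    times = sorted(spike_times_s)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(isi < threshold_s for isi in isis) / len(isis)


if __name__ == "__main__":
    example = [0.0, 0.1, 0.102, 0.3, 0.5]   # made-up spike times
    print(f"{100 * isi_violation_fraction(example):.1f}% of ISIs below 4 msec")
```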
Behavior
Figure 4A shows the behavioral performance of an animal in one example session. The performance level of the animal reaches criterion for NS 2 within 180 trials (45 stimulus presentations) and is close to 100% for NS 1 from the beginning. This strategy (first responding to the same key for both new stimuli, and then adjusting responses for one of the stimuli) is observed about as often as initial random responding to both NS. In this session, NS 2 was randomly chosen to undergo extinction, meaning that all choices following this stimulus remain inconsequential (transitions between learning stages are indicated by vertical black dotted lines). During extinction, performance decreases for the extinction stimulus but stays high for the other NS. The extinction criterion is reached in trial 370. Correct and incorrect responses are now reinforced and punished again (reacquisition), and performance level reaches criterion in trial 402. Performance level for FS is consistently high (>95%; data not shown). Figure 4B shows the mean number of trials needed to complete each stage of learning (averaged over 5 animals and 44 sessions in total). On average, animals needed ~700 trials to consistently respond correctly. Extinction took ~900 trials, and reacquisition merely about 60 trials, substantially fewer than the original acquisition.
Figure 4. Example behavioral results. A) One bird's performance for the 2 novel stimuli across all three stages of learning. Curves depict percent correct choices (mean over the last 120 trials, corresponding to 30 presentations of the respective stimulus) as a function of the total number of trials, separately for novel stimulus 1, novel stimulus 2, and averaged across both stimuli. Performance for familiar stimuli was consistently above 95% correct (data not shown). B) Mean number of trials needed to achieve criterion performance in each of the three stages of learning; error bars, SEM.
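The learning curves in panel A are running averages over the last 30 presentations of a stimulus. A minimal Python sketch of that computation follows. It also includes a helper that finds the first presentation at which a performance threshold is crossed. The function names are hypothetical, and the threshold values in the example are placeholders, since the actual criterion levels are not given in this section.

```python
from collections import deque
from typing import Iterable, List, Optional


def running_percent_correct(outcomes: Iterable[bool], window: int = 30) -> List[float]:
    """Percent correct over the last `window` presentations of one stimulus."""
    recent = deque(maxlen=window)   # oldest outcome drops out automatically
    curve = []
    for correct in outcomes:
        recent.append(bool(correct))
        curve.append(100.0 * sum(recent) / len(recent))
    return curve


def first_presentation_meeting(curve: List[float], threshold: float,
                               window: int = 30, above: bool = True) -> Optional[int]:
    """1-based presentation number at which the curve first meets the threshold.

    Only full windows are considered. Use above=True for acquisition and
    reacquisition, above=False for the performance decrement in extinction.
    """
    for i, value in enumerate(curve):
        if i + 1 < window:
            continue
        if (value >= threshold) if above else (value <= threshold):
            return i + 1
    return None


if __name__ == "__main__":
    # Made-up outcome sequence: 10 errors followed by 40 correct choices.
    curve = running_percent_correct([False] * 10 + [True] * 40)
    print(first_presentation_meeting(curve, threshold=80.0))  # placeholder threshold
```

With four equally frequent stimuli, a presentation number converts to a trial number by multiplying by four, which is how 30 presentations correspond to 120 trials in the caption.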
Neural Data
Figure 5 shows the response patterns of two units in the nidopallium caudolaterale (NCL) recorded while an animal was performing the SIFC task. Response modulation of the first unit during presentation of the NS is shown in Figure 5A. In the acquisition phase, the unit responds strongly to NS 2 (designated for extinction), with responses declining towards the end of the acquisition phase and little change in firing during the other two stages of learning. There is little responding to NS 1 across the entire session. The response increase around 3-4 sec after sample stimulus onset is due to reward delivery. Activity levels concerning familiar stimuli were not modulated (data not shown).
Figure 5B displays the response pattern of another NCL unit recorded during SIFC. This neuron responds during right- but not leftwards movements (upper left), suggestive of sensorimotor coding. However, response strength changed over the stages of learning: the two lowermost panels show spike density functions (SDFs) triggered to rightward choices for one familiar (left) and one novel stimulus (right), split up into successive quartiles to illustrate the development across the experimental session. Responses were lower for the familiar stimulus throughout the entire session, even though average movement times for the two stimulus conditions were highly similar (upper right). Moreover, responses during rightward choices after presentation of the novel but not the familiar stimulus decreased over the course of the experimental session (not paralleled by a decrease in baseline firing rate). Thus, both neurons decreased firing as a particular novel stimulus became increasingly familiar, with the neuron in Figure 5B coding for a specific movement in addition to the novelty of the stimulus preceding that movement.
Figure 5. Response patterns from two example units recorded during the SIFC task. A) Spike density functions triggered to onset of the 2 novel stimuli NS 1 and NS 2 (upper and lower row, respectively), split up for 3 learning phases (columns), with responses in each learning phase again split up in 3 equal parts (early, middle, late). NS 2 was designated for extinction. PSTHs (bin width 1 msec) were smoothed with an exponentially modified Gaussian kernel (σ = 100 msec and τ = 100 msec). B) SDFs from a putative motor neuron. Upper left panel shows SDFs (as in A, but σ and τ equaled 150 msec) triggered relative to left and right choices. Colored vertical dotted lines depict median leaving times for each choice. The two lowermost panels show SDFs for rightward choices following presentation of a familiar (left) or novel stimulus (right). SDFs are constructed separately for 4 equally sized subsets of the data, split up according to time in session. The panel in the upper right shows the distributions of movement times (rightward only), separately for each session quartile and preceding stimulus (familiar, F, novel, N).
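The smoothing step named in the caption can be sketched as follows. An exponentially modified Gaussian is the convolution of a Gaussian (width σ) with an exponential decay (time constant τ), so the kernel is built numerically here and then applied to a PSTH with 1 msec bins. This Python sketch is an illustration, not the authors' analysis code; the function names, the truncation of the kernel at 4σ and 5τ, and the alignment of the Gaussian at zero lag are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def emg_kernel(sigma_ms: float = 100.0, tau_ms: float = 100.0,
               dt_ms: float = 1.0) -> Tuple[List[float], int]:
    """Exponentially modified Gaussian kernel, normalized to unit sum.

    Returns (weights, first_lag), where first_lag is the lag in bins of the
    first weight relative to the bin being smoothed.
    """
    half = int(round(4 * sigma_ms / dt_ms))            # Gaussian truncated at 4 sigma
    gauss = [math.exp(-0.5 * (i * dt_ms / sigma_ms) ** 2)
             for i in range(-half, half + 1)]
    n_exp = int(round(5 * tau_ms / dt_ms)) + 1         # exponential truncated at 5 tau
    expo = [math.exp(-i * dt_ms / tau_ms) for i in range(n_exp)]
    weights = [0.0] * (len(gauss) + len(expo) - 1)
    for i, g in enumerate(gauss):                      # discrete convolution
        for j, e in enumerate(expo):
            weights[i + j] += g * e
    total = sum(weights)
    return [w / total for w in weights], -half


def spike_density(psth_hz: Sequence[float], weights: Sequence[float],
                  first_lag: int) -> List[float]:
    """Smooth a PSTH (spikes/sec per bin) with the kernel; edges are not padded."""
    n = len(psth_hz)
    sdf = [0.0] * n
    for t, rate in enumerate(psth_hz):
        if rate == 0.0:
            continue
        for k, w in enumerate(weights):
            idx = t + first_lag + k
            if 0 <= idx < n:
                sdf[idx] += rate * w
    return sdf


if __name__ == "__main__":
    weights, first_lag = emg_kernel()                  # sigma = tau = 100 msec
    psth = [0.0] * 2000
    psth[500] = 1000.0                                 # one spike in a 1 msec bin
    sdf = spike_density(psth, weights, first_lag)
    print("SDF peaks at bin", max(range(len(sdf)), key=sdf.__getitem__))
```

Because the kernel has unit sum, the smoothed function keeps the units of the PSTH (spikes/sec); values near the edges of the analysis window are underestimated because no padding is applied.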
This protocol describes a complex behavioral task suitable for concurrent single-unit recordings. We have described the SIFC task for pigeons, but it can be easily adapted to rodents by requiring nose pokes or lever pressing rather than key pecks, and substituting visual by olfactory, auditory, or tactile stimuli.
Perhaps the most critical steps during the training procedure are 1) gradual reduction of reward probability and 2) increase in trial number. Regarding intermittent reinforcement for the familiar stimuli, we decided on reward probabilities ranging from 0.5 to 0.8; these are high enough to produce stable performance but low enough to prevent premature satiation. That said, many birds are willing to perform well for reward probabilities down to 0.2.
The large number of trials per session (500-1,500) is necessary because the acquisition, extinction, and reacquisition of conditioned responding simply require this many trials, and because the precise estimation of firing rates is difficult with fewer than, say, 25 trials, especially when recording from neurons with low firing rates (in the NCL, baseline firing rates are <1 Hz). Accordingly, we set the minimum number of trials necessary for completing a learning stage such that each stimulus is shown at least 35 times.
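The arithmetic behind these numbers can be made explicit with a rough calculation. The Python sketch below assumes Poisson spiking (an assumption, not a claim from the text), under which the relative error of a firing-rate estimate equals one over the square root of the expected spike count. It also assumes that the four stimuli (two novel, two familiar) are presented equally often. Function names are hypothetical.

```python
import math


def rate_estimate_cv(rate_hz: float, window_s: float, n_trials: int) -> float:
    """Coefficient of variation of a firing-rate estimate under Poisson spiking.

    The expected spike count is rate * window * trials; for a Poisson count N,
    the relative error of the rate estimate is 1 / sqrt(E[N]).
    """
    expected_spikes = rate_hz * window_s * n_trials
    return 1.0 / math.sqrt(expected_spikes)


def min_trials_per_stage(min_presentations: int = 35, n_stimuli: int = 4) -> int:
    """Minimum trials per learning stage if all stimuli are equally frequent."""
    return min_presentations * n_stimuli


if __name__ == "__main__":
    for n in (10, 25, 35):
        cv = rate_estimate_cv(rate_hz=1.0, window_s=2.0, n_trials=n)
        print(f"{n} trials, 1 Hz, 2-sec window: relative error {cv:.2f}")
    print("minimum trials per stage:", min_trials_per_stage())
```

For a unit firing at 1 Hz observed during the 2-sec sample phase, 25 trials yield about 50 expected spikes, which illustrates why far fewer trials make rate estimates unreliable for low-rate neurons.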
For a naïve animal, training on the SIFC task takes approximately 4 months, but the exact duration depends heavily on the individual. Due to the high demands of the task, it is quite likely that not all animals will end up performing well on the final paradigm. If an individual bird skips too many trials or produces high error rates during training, do not hesitate to replace this subject. In our experience, it is highly probable that this animal will never perform properly on the final task.
Most previous studies conducting single-unit recordings in freely moving pigeons failed to properly register motor output during recording. This complicates the interpretation of neuronal responses during critical periods of the trials, like sample presentation or delay phases25. This problem is inherent in go/no-go tasks in which the experimenter usually does not know what the subject is doing on no-go trials; the same caveat applies to working memory tasks incorporating a prolonged delay period. To achieve control over the movement of the animals without employing head-fixation, we designed a task in which animals have to perform the same action (key pecking) even though conditions (sample stimuli) change. In our SIFC paradigm, both visual input and motor output are well controlled and constantly monitored. Since animals are required to peck at every sample stimulus throughout its presentation, we keep motor output constant while animals are viewing stimuli with different learning histories. We are currently exploring methods to achieve even better control of motor output, such as attaching an accelerometer to the headstage for the continuous registration of head movements. Also, we are developing methods for measuring the force of each key peck by means of a mechanoelectric transducer.
Our paradigm allows disentangling the contributions of sensory, motor, and cognitive variables to neural firing rates by identifying typical neural response patterns. For example, a premotor neuron for leftward responses would be expected to increase firing during the sample phase whenever the animal is going to make a leftward response, regardless of stimulus identity. Similarly, simple motor neurons would be expected to fire during key pecking, or left- or rightward motion. A neuron representing reward expectation, on the other hand, would fire during the sample phase, and more so for the FS than the NS during early acquisition (because subjective reward probability is higher on FS than NS trials before NS are learned), but this should reverse later when the NS are consistently classified correctly (because objective reward probability is higher on correct NS trials). Finally, neurons responsive to specific stimulus features are expected to fire consistently for one of the sample stimuli without any change across stages of learning.
Because extracellular unit recording is prone to record spikes from multiple units at a time11,26, inspecting a range of quality metrics is important to properly classify spikes as originating from a single or from multiple neurons27. Using tetrodes instead of single electrodes would certainly yield an additional increase in sorting quality11. This should be considered when recordings in brain regions with a high cell density (for example hippocampus) or very high spontaneous activity (such as the entopallium) are intended. However, the microplugs we use are only available with up to 18 connections, which for now constitutes an upper limit on the overall number of recording channels.
In sum, we developed a task of high complexity for non-primate experimental animals. This task was tailored to enable the investigation of learning phenomena with single-neuron recordings, but at the same time is suitable to tackle subjects such as categorization, decision making, and reward coding.
The authors have nothing to disclose.
This research was supported by grants from the German Research Foundation (DFG) to MCS (FOR 1581, STU 544/1-1) and OG (FOR 1581, SFB 874). The website of the DFG is http://www.dfg.de/en/index.jsp. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors thank Thomas Seidenbecher for providing us with the gold-plating protocol as well as Tobias Otto for help with setting up the electrophysiological recording equipment.
Name of Material/ Equipment | Company | Catalog Number | Comments/Description |
Resistance wire (for use as electrodes) | California Fine Wire, Grover Beach (CA), USA | Stablohm 675; formvar-coated nichrome wires (outer diameter 25 µm) | |
Microconnectors | Ginder Scientific, Nepean, Ontario, Canada | GS18PLG-220 (plug) & GS18SKT-220 (socket to build headstage) | |
Cannulae | Henke Sass Wolf, Tuttlingen, Germany | 0.4x20mm/ 27Gx3/4" | |
Gold solution for plating | Neuralynx, Bozeman (MT), USA | SIFCO Process Gold Non-Cyanide, Code 5355 | |
Solution for ultrasonic bath | Alconox, Inc., New York, USA | 1304 | Tergazyme |
Conductive glue | Henkel Loctite | LOCTITE 3888 Silver filled, conductive, adhesive | |
Stainless steel screws | J.I. Morris, Southbridge (MA), USA | F0CE125 self-tapping miniature screws, body length 1/8 inches | |
Light-curing dental cement | van der Ven Dental, Duisburg, Germany | Omniceram Evo Flow A2 | |
Light-curing unit | van der Ven Dental, Duisburg, Germany | Jovident Excelled 215 Curing Light (wireless LED light curing unit) | |
Filter amplifiers | npi electronic GmbH, Germany | DPA-2FS | |
A/D converter | Cambridge Electronic Design, Cambridge, UK | power 1401 | |
Spike2 software | Cambridge Electronic Design, Cambridge, UK | Version 7.06a | |
MATLAB | The Mathworks, Natick (MA), USA | R2012a | |