This protocol describes steps for using the novel software, SwarmSight, for frame-by-frame tracking of insect antenna and proboscis positions from conventional web camera videos using conventional computers. The free, open-source software processes frames about 120 times faster than humans and performs at better than human accuracy.
Many scientifically and agriculturally important insects use antennae to detect the presence of volatile chemical compounds and extend their proboscis during feeding. The ability to rapidly obtain high-resolution measurements of natural antenna and proboscis movements and assess how they change in response to chemical, developmental, and genetic manipulations can aid the understanding of insect behavior. By extending our previous work on assessing aggregate insect swarm or animal group movements from natural and laboratory videos using the video analysis software SwarmSight, we developed a novel, free, and open-source software module, SwarmSight Appendage Tracking (SwarmSight.org) for frame-by-frame tracking of insect antenna and proboscis positions from conventional web camera videos using conventional computers. The software processes frames about 120 times faster than humans, performs at better than human accuracy, and, using 30 frames per second (fps) videos, can capture antennal dynamics up to 15 Hz. The software was used to track the antennal response of honey bees to two odors and found significant mean antennal retractions away from the odor source about 1 s after odor presentation. We observed antenna position density heat map cluster formation and cluster and mean angle dependence on odor concentration.
Most arthropods move antennae or other appendages to sample environmental cues and signals in time and space. The animals can use the antennae to navigate their environment by detecting sensory cues such as chemical volatiles and gustatory and mechanical stimuli1,2,3,4. In insects, the antennae contain sensory receptors that bind to chemical volatiles4,5,6 and transmit this signal via olfactory sensory neurons to central brain regions1,7,8,9. The insects can adjust antennae positions to modulate information about incoming odors4,10,11. This modulation facilitates actively informed behavioral responses to odors and their plumes12,13.
Many insects, including Hymenopterans (e.g., honey bees and bumblebees), Lepidopterans (e.g., butterflies), and Dipterans (e.g., flies and mosquitoes), among others, feed by extending their proboscis14,15,16,17,18,19,20,21. Proboscis extension has been reliably used in the past for a variety of learning and memory tasks22,23,24,25,26,27,28,29,30,31. Similarly, quantitative assessment of antennae movement with high temporal and spatial resolution might yield insight into the relationship between the stimulus, the behavior, and the internal state of the animal. Indeed, previous work has shown that antennal movements contain a rich amount of information about honey bee tracking of the environment and that the movements change with learning32,33,34,35,36,37,38.
In the last decade, methods for observing animal behavior have been greatly accelerated by advances in high-resolution video cameras, computer processing speeds, and machine vision algorithms. Tasks like animal detection, counting, tracking, and place preference analyses have been aided with sophisticated software that can process videos of animal behavior and extract relevant measures39,40,41,42,43,44,45,46,47.
These technologies have also aided tracking of insect antenna and proboscis movements. It is possible for human raters to use a mouse cursor to manually track the position of the antennae. However, while this method can be accurate, the task is time-consuming, and human inattention and fatigue can make the results unreliable. Special equipment and preparation can be used to reduce the need for complex software. For example, one setup used a high-speed camera and painted the tips of the antennae to track the antenna movement48. Users can also be asked to select key-frames of videos to assist the software in detecting the antenna and proboscis location49. Another approach detected the two largest motion clusters to identify antennae, but it does not detect the proboscis location50. Another software package can detect antenna and proboscis locations, but requires about 7.5 s of processing time per frame51, which could be prohibitive for real-time or long-term observation studies. Finally, it might be possible to customize commercial software packages (e.g., EthoVision) to perform the task46, but their licensing and training costs can be prohibitive.
With the method described here, we extended our previous work on motion analysis software41 to track the locations of insect antennae and proboscis with the following goals: (1) no requirement for special hardware or complex animal preparation, (2) frame processing at real-time (30 fps or faster) on a conventional computer, (3) ease of use, and (4) open-source, easily extendable code.
The resulting novel method and open-source software, SwarmSight Appendage Tracking, does not require painting of the antennae tips, can use a consumer web camera to capture videos, and processes video frames at 30-60 fps on a conventional computer (Figure 1). The software takes video files as input. The user locates the position of the insect head in the video and, after processing, a comma separated values (.csv) file is produced with the locations of the antennae and proboscis. The software is capable of reading hundreds of different video formats (including formats produced by most digital cameras) through the use of the FFmpeg library52.
Figure 1: Animal Setup and software output. (A) A honey bee forager with its head and body restrained in a harness. (B) Odor source is placed in front of the animal, a video camera is positioned above, and a vacuum source is placed behind the animal. (C) The antenna tip and proboscis variables detected by the SwarmSight software from the video. (D) The user positions the antenna sensor over the animal and adjusts the filter parameters. The software detects the antenna and proboscis positions (yellow rings).
First, an insect's body and its head are restrained in a harness such that the antenna and proboscis movements are easily observed (Figure 1A). An odor source is placed in front of the insect, with a vacuum source placed behind, to remove the odors from the air and minimize potential effects of sensory adaptation (Figure 1B). A conventional web camera is placed above the insect's head on a tripod. An LED can be positioned within the camera's view to indicate when the odor is being presented.
Figure 2: Antenna coordinate system. X, Y values use the video coordinate system, where top left corner is the origin and X and Y values increase when moving towards the bottom right corner. Angles are expressed in degrees with respect to the front of the head (usually the odor source). A "0" value signifies that the line formed by the antenna flagellum is pointing directly in front of the animal. All angles are positive, except when an antenna points to the opposing direction (e.g., right flagellum points to the left).
After filming, the video file is opened with the SwarmSight software, where the user positions the Antenna Sensor widget (Figure 1D, black square) over the head of the insect, and starts the video playback. When the results are saved, the .csv file will contain the X, Y positions of the antenna tips, the antenna angles relative to the front of the head (Figure 2), and the proboscis X, Y position. Additionally, a dominant sector metric is computed for each antenna. The metric shows which of the five 36-degree sectors surrounding each antenna contained the most points deemed likely to be the antennae, and can be useful if the antenna position/angle metrics are not reliable due to noisy or otherwise problematic videos.
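The angle convention of Figure 2 and the dominant sector metric can be illustrated with a short Python sketch. This is not SwarmSight's implementation. The function names, the head-center and head-front inputs, the assumption that the camera views the animal from above, and the assumption that the five 36-degree sectors tile the 0 to 180 degree range on each antenna's own side are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

def antenna_angle(head_xy, front_xy, tip_xy, side):
    """Angle in degrees between the head's forward direction and the line
    from the head center to the antenna tip (video coordinates, Y grows
    downward). 0 means the antenna points straight ahead. The result is
    positive when the tip is on the antenna's own side and negative when it
    crosses to the opposite side (assumed reading of Figure 2)."""
    fx, fy = front_xy[0] - head_xy[0], front_xy[1] - head_xy[1]
    tx, ty = tip_xy[0] - head_xy[0], tip_xy[1] - head_xy[1]
    cross = fx * ty - fy * tx
    dot = fx * tx + fy * ty
    # With Y pointing down and the camera above the animal, a positive value
    # here means the tip lies toward the animal's right.
    signed = math.degrees(math.atan2(cross, dot))
    return signed if side == "right" else -signed

def dominant_sector(point_angles, sector_width=36.0, n_sectors=5):
    """Return the 1-based index of the sector holding the most candidate
    antenna points, or None if no point falls inside the sectors."""
    counts = Counter()
    for a in point_angles:
        if 0.0 <= a <= sector_width * n_sectors:
            idx = min(int(a // sector_width), n_sectors - 1)
            counts[idx + 1] += 1
    return counts.most_common(1)[0][0] if counts else None
```

For example, with the head center at (100, 100) and the head front at (100, 0), a right antenna tip at (150, 50) yields an angle of 45 degrees, while the same tip position assigned to the left antenna yields -45 degrees.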
Briefly, the software works by using a set of motion filters53 and a relaxed flood fill algorithm54. To find likely antenna points, two filters are used: a 3-consecutive-frame difference filter41,55 and a median-background subtraction56 filter. A color distance threshold filter is used for proboscis point detection. The top 10% of the points of each filter are combined, and a flood fill algorithm that inspects contiguous points with gaps up to 2 pixels (px) locates extreme points. Parallel frame decoding, processing, and rendering pipelines and optimized memory allocation of the filter data flow achieve high performance. The raw x and y coordinate values produced by the software are post-processed with a 3-frame rolling median filter57 (see Discussion). The instructions to download the full source code can be found online58.
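The filtering and flood fill steps described above can be sketched in plain Python on small grayscale frames stored as lists of rows. This is an illustrative reading of the description, not the SwarmSight source code: the function names, the way the two frame differences are combined, the ranking of only nonzero pixels for the top 10%, and the choice of seed point are all assumptions.

```python
from collections import deque
from statistics import median

def three_frame_difference(prev, curr, nxt):
    """Motion score per pixel: high only where the current frame differs
    from both the previous and the next frame (one possible formulation)."""
    return [[min(abs(c - p), abs(c - n)) for p, c, n in zip(rp, rc, rn)]
            for rp, rc, rn in zip(prev, curr, nxt)]

def median_background_difference(background_frames, curr):
    """Absolute difference between the current frame and the per-pixel
    median of a set of background frames."""
    return [[abs(v - median(f[y][x] for f in background_frames))
             for x, v in enumerate(row)]
            for y, row in enumerate(curr)]

def top_fraction(score, fraction=0.10):
    """Set of (x, y) points holding the highest `fraction` of nonzero scores."""
    pts = sorted(((v, x, y) for y, row in enumerate(score)
                  for x, v in enumerate(row) if v > 0), reverse=True)
    keep = max(1, int(len(pts) * fraction)) if pts else 0
    return {(x, y) for _, x, y in pts[:keep]}

def relaxed_flood_fill(points, seed, max_gap=2):
    """Collect points connected to `seed`, allowing up to `max_gap` missing
    pixels between neighbors along each axis."""
    reach = max_gap + 1
    seen = {seed}
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                q = (x + dx, y + dy)
                if q in points and q not in seen:
                    seen.add(q)
                    queue.append(q)
    return seen

def farthest_point(component, origin):
    """Extreme point of a component: the one farthest from `origin`."""
    return max(component,
               key=lambda p: (p[0] - origin[0]) ** 2 + (p[1] - origin[1]) ** 2)
```

In this sketch, the candidate sets from the two antenna filters would be merged with a set union, the fill would start from a seed near the antenna base (a hypothetical choice), and the farthest point of the filled component would serve as the antenna tip estimate.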
Below is a protocol to prepare a honey bee forager for antenna tracking. A similar protocol could be used to track the antenna/proboscis movements of any other insect. In the results section, we describe a sample antenna trace output that is detected by the software, the comparison of the software output to tracking performed by human raters, and assessment of antennae movement in response to five odor conditions.
1. Catch and Harness Honey Bees
2. Preparing the Animal Harness and Video Camera
3. Film Each Individual under Experimental Conditions
4. Video Analysis
The sections below present an example plot of antenna angles produced from the software's output data, a comparison of the software's accuracy and speed with those of human raters, and the results of an experiment in which honey bee antenna movement is affected by the presentation of different odors. R software62,63 was used to perform the analysis and generate the figures. R code for analysis and figure generation as well as video tutorials can be found online58.
Software Output:
Figure 3 shows five randomly selected traces of antenna angles detected by the software from videos of honey bees presented with pure and 35x mineral oil diluted versions of heptanal and heptanol, as well as clean air.
Figure 3: Five sample traces of antennae angles detected by SwarmSight. Y-axis shows antenna angle in degrees, where "0" is directly in front of the animal, towards the odor source, with larger values pointing away from the odor source. Heptanol, heptanal, and their 35x mineral oil diluted versions, as well as clean air, were applied during the gray 0 – 3,600 ms windows to single honey bee foragers. Left antenna is marked red, right marked blue. Five random bees, one from each condition, are depicted in the five plots.
Software Validation:
To validate that the software can reliably detect the locations of the antennae, antenna positions located by humans were compared with the positions located by the software. Two human raters were asked to locate the antenna and proboscis tips in 425 video frames (~14 s of video). A custom software module recorded the appendage locations marked by the raters, automatically advanced video frames, and recorded the amount of time spent on the task. As an example of correspondence between human- and software-located values, superimposed vertical coordinate traces of one antenna for the software and for the two human detected locations are shown in Figure 4A. The distance between the two raters' marked antenna positions was computed and named "Inter-Human Distance." The distance between the antenna location detected by the software and the closest location detected by the human raters was computed and named "Software-Closest Human Distance" (Figure 4B).
Figure 4: Comparison with human raters. (A) Two human raters and SwarmSight located antenna tips in 425 video frames. The frame-by-frame left antenna tip Y coordinates found by the human raters and software are superimposed. (B) Superimposed frame-by-frame disagreement (distance in video pixels) between human raters (orange) and disagreement between software and closest human rater value (black). (C) Human vs. human antenna tip locations (orange) and software vs. human locations (black). (D) Histograms and cumulative distributions (dashed) of human vs. human and software vs. human frame-by-frame disagreement distances.
Inter-Human Distance was 10.9 px on average, within 55.2 px in 95% of the frames, and had a maximum value of 81.6 px. The Software-Closest Human Distance was 8.0 px on average, within 18.3 px in 95% of the frames, and had a maximum value of 49.0 px (see distance histograms in Figure 4D, and Figure 4C). 5 px was approximately the width of an antenna. Overall, the Inter-Human Distance was small for the frames at the beginning of the task, and increased in the second half of the task. We suspect this was due to rater fatigue. Meanwhile, Software-Closest Human Distance levels remained constant throughout the task.
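The two disagreement measures and their summary statistics can be expressed compactly. The Python sketch below assumes each rater's output is a list of (x, y) tip positions with one entry per frame; the function names and the nearest-rank definition of the 95% bound are illustrative assumptions and may differ from the R analysis code used for the figures.

```python
import math

def inter_human_distance(rater1, rater2):
    """Per-frame Euclidean distance (px) between the two raters' positions."""
    return [math.dist(a, b) for a, b in zip(rater1, rater2)]

def software_closest_human_distance(software, rater1, rater2):
    """Per-frame distance (px) from the software position to whichever
    human-marked position is closer."""
    return [min(math.dist(s, a), math.dist(s, b))
            for s, a, b in zip(software, rater1, rater2)]

def summarize(distances, q=0.95):
    """Mean, nearest-rank q-quantile, and maximum of a list of distances."""
    d = sorted(distances)
    idx = min(len(d) - 1, math.ceil(q * len(d)) - 1)
    return {"mean": sum(d) / len(d), "within": d[idx], "max": d[-1]}
```

Taking the closer of the two human positions for each frame means the software is not penalized for siding with one rater in frames where the raters themselves disagree.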
Software Speed and Accuracy Comparison with Human Raters:
Humans rated antenna tip and proboscis locations at an average speed of 0.52 frames per second (fps). To estimate human fps, the total number of frames rated by humans (425 each) was divided by the total time they spent on the task (873 s and 761 s). The software rated the frames at 65 fps on average on a Dual-Core Windows 7 PC. Given its high processing speed and accuracy similar to or better than that of human raters, the software can be expected to perform the work of about 125 human raters per unit of time (65 fps / 0.52 fps).
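The speed comparison follows directly from the numbers reported above. The short Python calculation below reproduces it; the variable names are illustrative.

```python
frames_per_rater = 425
rater_seconds = [873, 761]   # time each rater spent on the task
software_fps = 65            # average software processing rate

# Pooled human rate: total frames rated divided by total time spent
human_fps = frames_per_rater * len(rater_seconds) / sum(rater_seconds)
speedup = software_fps / human_fps

print(f"human: {human_fps:.2f} fps, software/human: {speedup:.0f}x")
# prints "human: 0.52 fps, software/human: 125x"
```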
Detection of Antenna Response to Odors:
To demonstrate that the protocol can be used to detect significant behavioral differences in insect movement, we subjected 23 female honey bees to two different odors. Pure heptanal and heptanol, 35x mineral oil dilutions of the two odors, and clean air as the control, were presented each for 4 s (five conditions in total). Videos, as described in the protocol above, were processed with SwarmSight software, and the antenna angles analyzed (Figure 5).
Figure 5: Antenna angle means and density heat maps for five odor conditions. (A) Heat maps showing antenna angle density before, during (darker middle region), and after administration of heptanol, air, and heptanal odorants to female honey bees (n = 23). Black curves are per-frame average antenna angles (both antennae). Horizontal lines are pre-odor mean (baseline) angles. Note the cluster of preferred antenna locations (red cluster in bottom plot) away from the odor source for pure odor conditions, and corresponding changes to the mean antenna angle. Also note the "rebound" cluster after odor conclusion and its apparent onset dependence on odor concentration (see cluster location in the other four plots). Density heat map color scale is arbitrary but uniform across all conditions. (B) Mean angle change from pre-odor mean (error bars S.E.M). Except for air, all mean changes were significant (t-test p <0.05).
Video frames from 9 s segments of the video consisting of 3 s before the odor onset, 3.6 s of odor presentation, and 2.4 s after odor conclusion were aligned across all individuals and conditions (270 frames/segment at 30 fps). The per-frame means of both antenna angles of all individuals were computed for each condition and called "Mean Angles" (Figure 5A, black curves). The mean antenna angles of frames before the odor onset across individuals for each condition were computed and called "Pre-Odor Baselines" (Figure 5A, thin horizontal lines).
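The alignment and averaging described above can be sketched as follows. The Python code assumes each individual's trace is already cut to the same segment and stored as a list of (left angle, right angle) pairs per frame; the function names and data layout are hypothetical and do not reflect the published R analysis code.

```python
FPS = 30
PRE_S, ODOR_S, POST_S = 3.0, 3.6, 2.4   # segment layout in seconds

def segment_frames(fps=FPS):
    """Frames in one aligned segment: 9 s at 30 fps gives 270 frames."""
    return round((PRE_S + ODOR_S + POST_S) * fps)

def mean_angles(traces):
    """Per-frame mean of both antenna angles across individuals.
    `traces` holds one list of (left, right) angle pairs per individual."""
    n_frames = len(traces[0])
    return [sum((t[i][0] + t[i][1]) / 2 for t in traces) / len(traces)
            for i in range(n_frames)]

def pre_odor_baseline(mean_curve, fps=FPS):
    """Mean of the per-frame mean angles over the frames before odor onset."""
    n_pre = round(PRE_S * fps)
    return sum(mean_curve[:n_pre]) / n_pre
```

Because every individual contributes the same number of pre-odor frames, averaging the mean curve over those frames gives the same value as averaging all pre-odor angles across individuals directly.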
In all conditions, except control, the mean angles increased from baselines, each peaking once 750 – 1,050 ms after the odor onset (Figure 5A, black curves in 0 – 3,600 ms region). The mean changes from baselines were tested for significance (Figure 5B) by comparing the two-antenna means of individuals at the peak odor-presentation mean angle time of each condition to the baseline mean using a series of 1-sample t-tests (Shapiro normality tests were not significant in any condition). The mean angle change from baseline was 26.9° for pure heptanal (mean peaked at 750 ms after odor onset), 21.1° for 0.2 M heptanal (at 990 ms), 19.6° for pure heptanol (at 1,050 ms), 19.3° for 0.2 M heptanol (at 780 ms), and 3.45° for air control (no peak). In all conditions, except control, the mean angle change from baseline was significant (Holm adjusted p <0.05). We note that the mean angle takes longer to return to baseline in response to pure odorants than to diluted odorants (low-pass filtered mean returned to baseline 3,690 ms after odor onset for pure and at 2,940 ms for diluted heptanol; for heptanal, return times were 4,260 ms for pure and 3,000 ms for diluted versions).
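The testing procedure above combines a 1-sample t statistic per condition with a Holm correction across conditions. The Python sketch below shows both steps using only the standard library; the function names are illustrative, and converting the t statistic to a p-value requires a t-distribution function (for example, from a statistics package), which is deliberately left out here.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, mu0):
    """t statistic and degrees of freedom for a 1-sample t-test of the
    sample mean against mu0."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / math.sqrt(n))
    return t, n - 1

def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, ...
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values may never decrease along the sorted order
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

For instance, raw p-values of 0.01, 0.04, and 0.03 become 0.03, 0.06, and 0.06 after Holm adjustment, so only the first would remain below 0.05.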
Visualization Using Heat Maps:
To visualize the antenna responses, antenna angle density heat maps for each condition were generated (Figure 5A, blue-red background). Antenna angles across the 9 s video segments for each individual per condition were convolved with a Gaussian kernel (R package MASS, kde2d function64). Blue areas show low densities of antenna angles, while red areas show high densities of antenna angles. The heat map in the bottom plot of Figure 5A for the pure heptanal condition illustrates the antenna behavior.
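The idea behind the density heat maps can be illustrated with a minimal two-dimensional Gaussian kernel density estimate over (time, angle) observations. The Python sketch below is not a reimplementation of the kde2d function: the function name, the grid arguments, and the treatment of h_t and h_a as kernel standard deviations are assumptions, and kde2d applies its own bandwidth conventions.

```python
import math

def gaussian_kde_2d(times, angles, grid_t, grid_a, h_t, h_a):
    """Density of (time, angle) observations on a grid, using a product of
    Gaussian kernels with standard deviations h_t and h_a.
    Returns one row per grid angle and one column per grid time."""
    n = len(times)
    norm = 1.0 / (n * 2.0 * math.pi * h_t * h_a)
    density = []
    for a in grid_a:
        row = []
        for t in grid_t:
            total = sum(
                math.exp(-0.5 * (((t - ti) / h_t) ** 2 + ((a - ai) / h_a) ** 2))
                for ti, ai in zip(times, angles)
            )
            row.append(norm * total)
        density.append(row)
    return density
```

Mapping low values of the returned grid to blue and high values to red would give a heat map of the kind shown in Figure 5A.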
The map shows that before the odor is presented (t < 0), the antenna angle density is distributed relatively uniformly across all angles. About 1 s after odor onset (t ~1,000 ms), a pair of blue and red clusters appears. In areas shaded red, the antennae were found more frequently than in areas shaded blue. The blue cluster indicates that antennae tended to avoid smaller angles (odor source was located in the direction of 0 degrees), while the red cluster indicates that antennae preferred greater angles (away from odor source). The red cluster gradually disappears as the odor presentation is maintained. Another red, albeit less intense, cluster appears about 1 s after odor conclusion. We name the second red clusters "Rebound Clusters". Consistent with the mean angle recovery times above, we note that the rebound clusters seem to appear earlier and are less intense for diluted odors than for pure odors.
The method presented here enables real-time tracking of insect antenna and proboscis movements without requiring special animal preparations or hardware.
Limitations:
Despite these advantages, there are some limitations of the method. These include the requirement that the head of the animal is restricted from movement, the need for the user to select the location and orientation of the animal for each video, the requirement to have access to a Windows computer, and the software's inability to track movement in three dimensions (3D) and in some visually ambiguous appendage positions described below.
The software requires that the head of the animal is fixed in place and is not moving during the video. This is similar to the preparations of previous work48,49,50,51. It is possible to modify the software to allow automatic detection of head rotations; however, this would consume additional processing time and introduce a new source of error. If the modified software were to detect the head rotation incorrectly, this would affect the antennae angles, as their computation is relative to the head rotation angle. Currently, the user selects the head orientation once per video. This approach, while not without human error, minimizes angle calculation errors when the head is not allowed to move during the video.
The software also requires a Windows 7 (or later) operating system (OS). The goal was to make the software easy to install, set up, and use by users without programming or sophisticated computer administration skills. We decided to target Windows because it is widely available, and in cases where access to it is limited, virtual machines (e.g., VirtualBox, VMware, Parallels) with Windows can be easily created. This choice of OS greatly simplifies software installation through the use of an easy-to-use, command-line-free installer and avoids bugs specific to different OSs.
The software only tracks the position of the appendages in 2D space. Insects are known to move their antennae in 3D, which could mean that important information is lost when only 2D coordinates are measured. While the use of multiple cameras or mirrors could aid in collecting the additional information required for 3D localization, it is possible to compute, with the use of trigonometric relations, an estimated out-of-plane position by assuming that the antennae are single line segments of constant length and only move on one side of the camera plane. For honey bees, this assumption holds well enough to yield rough estimates of the 3D position, but it would not necessarily hold for other species and situations.
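The trigonometric estimate mentioned above can be written out under the stated assumptions: the antenna is a single straight segment of constant, known length, and it stays on one side of the camera plane. In the Python sketch below, the function name, the antenna base position, and the antenna length in pixels are hypothetical inputs that would need to be measured for each video.

```python
import math

def out_of_plane_estimate(base_xy, tip_xy, antenna_length_px):
    """Estimate the out-of-plane offset (px) of the antenna tip and the
    elevation angle (degrees) of the antenna relative to the image plane,
    from its projected 2D length and an assumed true length."""
    projected = math.dist(base_xy, tip_xy)
    # Noise can make the projection slightly longer than the assumed length
    projected = min(projected, antenna_length_px)
    z = math.sqrt(antenna_length_px ** 2 - projected ** 2)
    elevation = math.degrees(math.atan2(z, projected))
    return z, elevation
```

For example, an antenna assumed to be 5 px long that appears 3 px long in the image would be estimated to rise 4 px out of the image plane, at an elevation of about 53 degrees.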
The software will not correctly detect the antennae and proboscis tip locations in some ambiguous situations. If an animal moves an antenna so that, in the video, it overlaps an extended proboscis, the software will likely detect the tip of the antenna as the tip of the proboscis. The antenna angle, however, will still likely be computed correctly (from the non-overlapping part). Similarly, if the antenna tips move directly above the head of the animal (i.e., not on the sides) then the software might only detect the part of the antenna that is visible outside of the head, or assume the previous location of the antenna, or detect spurious video noise as antenna location. In both situations, even human raters have difficulty discerning the antenna from the proboscis or the head. To mitigate this problem, we recommend applying a 3-frame, symmetric rolling median57 filter to the raw X and Y coordinates produced by the software. This filter removes large transient (single-frame) position fluctuations, and preserves longer antenna position movements. We have found that the 3-frame filter performed better than no filter, while wider filters (e.g., 5, 11, or 15 frames) reduced accuracy. Example R code that uses the filter and a video tutorial can be found online58.
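A 3-frame symmetric rolling median is simple to apply to each coordinate column. The Python sketch below illustrates the filter; it is not the example R code referenced above, the function name is illustrative, and leaving the first and last frames unchanged is an assumption about edge handling.

```python
from statistics import median

def rolling_median_3(values):
    """Replace each value with the median of itself and its two neighbors.
    The first and last values are kept as they are."""
    if len(values) < 3:
        return list(values)
    filtered = [values[0]]
    filtered += [median(values[i - 1:i + 2]) for i in range(1, len(values) - 1)]
    filtered.append(values[-1])
    return filtered

# A single-frame jump to 90 px is removed, while the slow trend is preserved
x_raw = [10, 11, 90, 12, 13]
x_filtered = rolling_median_3(x_raw)
print(x_filtered)  # prints [10, 11, 12, 13, 13]
```

The same function would be applied separately to the X and Y columns of each appendage.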
Value as a Scientific Tool:
The availability of a method to rapidly obtain accurate insect appendage movements in a cost-effective manner has the potential to open up new areas of investigation.
Proboscis extension reflex (PER) is a commonly used behavioral response to investigate learning and memory of a variety of insects59. Previous studies have generally relied on a binary extended-or-not measure of PER, although video and electromyographic analyses have shown much more complex topologies to proboscis movements65,66. The method here allows rapid quantification of proboscis movements at high temporal and spatial resolution.
Insect antenna movements in response to odors are poorly understood. One reason for this is that the antennae tend to move so rapidly that a cost-effective, automated means to obtain antenna movement data has not been available. The method presented here could be used to rapidly obtain antenna movement data for large numbers of insects in a large number of conditions. This could aid, for example, researchers investigating the mapping between antenna movements in response to various stimuli, in particular volatile odors. Using cameras that capture frames at 30 Hz, the software can be used to characterize antennal movement dynamics up to 15 Hz (Nyquist limit). If characterization in higher frequencies is needed, cameras with higher capture rates (e.g., 60 or 120 fps) could be utilized. However, a faster computer may be required to process higher fps videos in real-time. We speculate that classes of odors, and possibly even some individual odors, have characteristic innate antennal movements. If those classes or compounds could be discovered, unknown odors or their class could be detected from antennal movement of untrained insects. If such a mapping exists, then the combination of sufficient antenna movement data and state of the art machine learning algorithms should begin to uncover it. Also, how that mapping changes in response to learning, forms during development, or is disrupted with genetic interventions could offer insight into functions of the olfactory system. Finally, this work could give insight into artificial detection of odors if it reveals optimal sampling methods for odors in complex environments.
Future Work:
Here, we showed that antenna movement data can be rapidly obtained and analyzed: significant behavior responses can be detected from the data generated by our software, and several areas of further investigation were identified.
The time courses of stimulus-elicited antenna angle deviations from and recovery to baseline, as well as any stimulus-conclusion rebound effects and their dependence on odor concentration, can be investigated and modeled mathematically. Additionally, any changes of antenna movements induced by appetitive or aversive conditioning can also be assessed with the software.
Better differentiation of odors can also be explored. In this study, both odors, in pure and 35x diluted versions, elicited similar responses: the antennae, on average, appeared to rapidly withdraw away from the odor source and return to pre-odor baselines after a few seconds. We speculate that even the diluted versions may have been very strong olfactory stimuli for the honey bees. If true, a broader range of concentrations could be used to determine if the antennal responses differentiate the odors. Additionally, more sophisticated analysis may better reveal differences in antennal movements in response to different odors. We have made the data files used to generate figures in this manuscript available to interested researchers on the SwarmSight website67.
Furthermore, while outside the scope of this manuscript, the software could be extended to process videos of animals placed in chambers with dual mirrors angled at 45° (see Figure 1D for example). This could be used to accurately localize and track the appendages and their movement in 3D space. However, the algorithms for 3D tracking would be required to efficiently: (a) disambiguate between multiple antennae when they are visible in one of the side mirrors, (b) correct for imperfections in mirror angles, and (c) account for distortions due to camera positioning.
Finally, additional gains in position accuracy might be realized via the use of a Kalman filter68, which models and utilizes physical state information such as appendage velocity and acceleration to constrain predicted locations. However, any gains in accuracy should be evaluated against any reductions in speed due to additional computations.
Conclusion:
Many insects use antennae to actively sample volatile compounds in their local environments. Patterns in antennal movements may provide insight into insect odor perception and how it is affected by conditioning, toxic compounds, and genetic alterations. Similarly, proboscis movements have been used to assess odor perception and its modulation. However, rapidly obtaining large quantities of high-resolution appendage movement data has been difficult. Here, a protocol and software are described that automate this task. In summary, we have created the software and demonstrated how the combination of inexpensive hardware, a common animal preparation, and the open-source software can be used to rapidly obtain high-resolution insect appendage movement data. The output of the software, how it outperforms human raters in speed and accuracy, and how its output data can be analyzed and visualized were shown.
The authors have nothing to disclose.
JB, SMC, and RCG were supported by NIH R01MH1006674 to SMC and NIH R01EB021711 to RCG. CMJ and BHS were supported by NSF Ideas lab project on "Cracking the olfactory code" to BHS. We thank Kyle Steinmetz, Taryn Oboyle, and Rachael Halby for their assistance in conducting this research.
Name | Company | Catalog Number | Comments |
Insect harness | N/A | N/A | Use materials needed for Protocol sections 1-3.1.1 of Smith & Burden (2014) |
Odor delivery source | N/A | N/A | Use materials needed for Protocol section 3 of Smith & Burden (2014) |
Vacuum source | N/A | N/A | Use materials needed for Protocol section 3 of Smith & Burden (2014) |
LED connected to odor delivery source | N/A | N/A | Use materials needed for Protocol section 3 of Smith & Burden (2014) |
Low Voltage Soldering Iron | Stannol | Low Voltage Micro Soldering Iron 12V, 8W | |
DC Power Supply | Tekpower | HY152A | |
White sheet of paper | Georgia-Pacific | 998606 | Any white sheet of paper can be used as alternative |
Tripod | AmazonBasics | 50-Inch Lightweight Tripod | Optional |
Camera | Genius | WideCam F100 | FLIR Flea3 or another camera with manual focus can be used. |
Camera software | Genius | N/A | Software comes with camera. On MacOS, Photo Booth app can be used to record videos. |
Camera shutter speed software | Genius | N/A | Genius camera software allows shutter speed setting. In Mac OS, iGlasses by ecamm can be used instead: http://www.ecamm.com/mac/iglasses/ |
Windows Operating System | Microsoft | Windows 7 Professional | Versions 7 or later are compatible. Oracle VirtualBox, Parallels Desktop, or VMWare Fusion can be used to create a Windows virtual machine in MacOS environments. |
SwarmSight software | SwarmSight | Appendage Tracking | Download from http://SwarmSight.org |
R software | R Project | R 3.4.0 | Download from: https://cran.r-project.org/bin/windows/base/ |
R Studio software | RStudio | RStudio Desktop | Download from: https://www.rstudio.com/products/rstudio/download/ |