We describe methods for presenting real-world objects and matched images of the same objects under tightly-controlled experimental conditions. The methods are described in the context of a decision-making task, but the same real-world approach can be extended to other cognitive domains such as perception, attention, and memory.
Our knowledge of human object vision is based almost exclusively on studies in which the stimuli are presented in the form of computerized two-dimensional (2-D) images. In everyday life, however, humans interact predominantly with real-world solid objects, not images. Currently, we know very little about whether images of objects trigger similar behavioral or neural processes as do real-world exemplars. Here, we present methods for bringing the real world into the laboratory. We detail methods for presenting rich, ecologically-valid real-world stimuli under tightly-controlled viewing conditions. We describe how to match closely the visual appearance of real objects and their images, as well as novel apparatus and protocols that can be used to present real objects and computerized images on successively interleaved trials. We use a decision-making paradigm as a case example in which we compare willingness-to-pay (WTP) for real snack foods versus 2-D images of the same items. We show that WTP increases by 6.6% for food items displayed as real objects versus high-resolution 2-D colored images of the same foods, suggesting that real foods are perceived as being more valuable than their images. Although presenting real object stimuli under controlled conditions presents several practical challenges for the experimenter, this approach will fundamentally expand our understanding of the cognitive and neural processes that underlie naturalistic vision.
The translational value of primary research in human perception and cognition hinges on the extent to which the findings transfer to real-world stimuli and contexts. A long-standing question concerns how the brain processes real-world sensory inputs. Currently, knowledge of visual cognition is based almost exclusively on studies that have relied on stimuli in the form of two-dimensional (2-D) pictures, usually presented in the form of computerized images. Although image interaction is becoming increasingly common in the modern world, humans are active observers for whom the visual system has evolved to allow perception and interaction with real objects, not images1. To date, the overarching assumption in studies of human vision has been that images are equivalent to, and appropriate proxies for, real object displays. Currently, however, we know surprisingly little about whether images effectively trigger the same underlying cognitive processes as do the real objects. Therefore, it is important to determine the extent to which responses to images are like, or different from, those elicited by their real-world counterparts.
There are several important differences between real objects and images that could lead to differences in how these stimuli are processed in the brain. When we look at real objects with two eyes, each eye receives information from a slightly different horizontal vantage point. This discrepancy between the different images, known as binocular disparity, is resolved by the brain to produce a unitary sense of depth2,3. Depth cues derived from stereoscopic vision, together with other sources such as motion parallax, convey precise information to the observer about the object’s egocentric distance, location, and physical size, as well as its three-dimensional (3-D) geometric shape structure4,5. Planar images of objects do not convey information about the physical size of the stimulus because only the distance to the monitor is known by the observer, not the distance to the object. While 3-D images of objects, such as stereograms, approximate more closely the visual appearance of real objects, they do not exist in 3-D space, nor do they afford genuine motor actions such as grasping with the hands6.
The practical challenges of using real object stimuli in experimental contexts
Unlike studies of image vision, in which stimulus presentation is entirely computer-controlled, working with real objects presents a range of practical challenges for the experimenter. The position, order, and timing of object presentations must be controlled manually throughout the experiment. Working with real objects (unlike images) can involve a significant time commitment due to the need to collect7,8,9 or make10 the objects, set up the stimuli prior to the experiment, and present the objects manually during the study. Moreover, in experiments that are designed to compare responses to real objects directly with responses to images, it is critical to match closely the appearance of the stimuli in the different display formats8,9. Stimulus parameters, environmental conditions, as well as randomization and counterbalancing of real object and image stimuli, must all be controlled carefully to isolate causal factors and rule out alternative explanations for the observed effects.
The methods detailed below for presenting real objects (and matched images) are described in the context of a decision-making paradigm. The general approach can be extended, however, to examine whether stimulus format influences other aspects of visual cognition such as perception, memory or attention.
Are real objects processed differently to images? A case example from decision-making
The mismatch between the kinds of objects that we encounter in real-world scenarios versus those examined in laboratory experiments is especially apparent in studies of human decision-making. In most studies of dietary choice, participants are asked to make judgments about snack foods that are presented as colored 2-D images on a computer monitor11,12,13,14. In contrast, everyday decisions about which foods to eat are usually made in the presence of real foods, such as at the supermarket or the cafeteria. Although in modern life we regularly view images of snack foods (e.g., on billboards, television screens, and online platforms), the ability to detect and respond appropriately to the presence of real energy-dense foods may be adaptive from an evolutionary perspective because it facilitates growth, competitive advantage, and reproduction15,16,17.
Research outcomes in scientific studies of decision-making and dietary choice have been used to guide public health initiatives aimed at curbing rising obesity rates. Unfortunately, however, these initiatives appear to have met with little to no measurable success18,19,20,21. Obesity remains a major contributor to the global burden of disease22 and is linked to a range of associated health problems, including coronary heart disease, dementia, Type II diabetes, certain cancers, and increased overall risk of morbidity22,23,24,25,26,27. The sharp rise in obesity and associated health conditions over recent decades28 has been linked with the availability of cheap, energy-dense foods18,29. As such, there is an intense scientific interest in understanding the underlying cognitive and neural systems that regulate everyday dietary decisions.
If there are differences in the way foods in different formats are processed in the brain, then this might provide insights into why public health approaches to combating obesity have been unsuccessful. Despite the differences between images and real-world objects, described above, surprisingly little is known about whether images of snack foods are processed similarly to their real-world counterparts. In particular, little is known about whether or not real foods are perceived to be more valuable or satiating than matched images of the same items. Classic early behavioral studies found that young children were able to delay gratification in the context of 2-D colored images of snack foods30, but not when they were confronted with real snack foods31. However, few studies have examined in adults whether the format in which a snack food is displayed influences decision-making or valuation12,32,33 and only one study to date, from our laboratory, has tested this question when stimulus parameters and environmental factors are matched across formats7. Here, we describe innovative techniques and apparatus for investigating whether decision-making in healthy human observers is influenced by the format in which the stimuli are displayed.
Our study7 was motivated by a previous experiment conducted by Bushong and colleagues12 in which college-aged students were asked to place monetary bids on a range of everyday snack foods using a Becker-DeGroot-Marschak (BDM) bidding task34. Using a between-subjects design, Bushong and colleagues12 presented the snack foods in one of three formats: text descriptors (e.g., 'Snickers bar'), 2-D colored images, or real foods. Average bids for the snacks (in dollars) were contrasted across the three participant groups. Surprisingly, students who viewed real foods were willing to pay 61% more for the items than those who viewed the same stimuli as images or text descriptors, a phenomenon the authors termed the 'real-exposure effect'12. Critically, however, participants in the text and image conditions completed the bidding task in a group setting and entered their responses via individual computer terminals; conversely, those assigned to the real food condition performed the task one-on-one with the experimenter. The appearance of the stimuli in the real and image conditions was also different. In the real food condition, the foods were presented to the observer on a silver tray, whereas in the image condition the stimuli were presented as scaled cropped images on a black background. Thus, it is possible that participant differences, environmental conditions, or stimulus-related differences could have led to inflated bids for the real foods. Following Bushong et al.12, we examined whether real foods are valued more than 2-D images of food, but critically, we used a within-subjects design in which environmental and stimulus-related factors were carefully controlled. We developed a custom-designed turntable in which the stimuli in each display format could be interleaved randomly from trial to trial.
Stimulus presentation and timing were identical across the real object and image trials, thus reducing the likelihood that participants could use different strategies to perform the task in the different display conditions. Finally, we controlled carefully the appearance of the stimuli in the real object and image conditions so that the real foods and images were matched closely for apparent size, distance, viewpoint, and background. There are likely to be other procedures or mechanisms that could allow for randomizing stimulus formats across trials, but our method allows for many objects (and images) to be presented in relatively rapid interleaved succession. From a statistical standpoint, this design provides greater power to detect significant effects than is possible using between-subjects designs. Similarly, the effects cannot be attributed to a-priori differences in willingness-to-pay (WTP) between observers. It is, of course, the case that within-subjects designs open the possibility for demand characteristics. However, in our study participants understood that they could 'win' a food item at the end of the experiment regardless of the display format in which it appeared in the bidding task. Participants were also informed that arbitrarily reducing bids (e.g., for the images) would reduce their chances of winning and that the best strategy for winning the desired item is to bid one’s true value34,35,36. The aim of this experiment is to compare WTP for real foods versus 2-D images using a BDM bidding task34,35.
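The incentive logic of the BDM task can be illustrated with a minimal sketch. It assumes that one trial is resolved by drawing a random price uniformly between $0 and $3 in one-cent steps, and that the participant buys the item at the drawn price whenever the bid is at least that price. The function names, the uniform price distribution, and the example true value of $1.50 are illustrative assumptions, not details taken from the protocol.

```python
import random

MAX_PRICE_CENTS = 300  # bids range from $0 to $3 in the task described above


def resolve_bdm_auction(bid_dollars, rng=None):
    """Resolve one BDM trial (hypothetical helper).

    A price is drawn at random; the participant wins the item and pays
    the drawn price (not the bid) only if the bid is at least the price.
    """
    rng = rng or random.Random()
    bid_cents = round(bid_dollars * 100)
    price_cents = rng.randint(0, MAX_PRICE_CENTS)  # assumed uniform draw
    won = bid_cents >= price_cents
    return {
        "price": price_cents / 100,
        "won": won,
        "paid": price_cents / 100 if won else 0.0,
    }


def expected_surplus(true_value_dollars, bid_dollars):
    """Expected (value - price) over all equally likely prices, in dollars."""
    value_cents = round(true_value_dollars * 100)
    bid_cents = round(bid_dollars * 100)
    total = sum(
        value_cents - price
        for price in range(0, MAX_PRICE_CENTS + 1)
        if price <= bid_cents
    )
    return total / (MAX_PRICE_CENTS + 1) / 100


if __name__ == "__main__":
    # Compare under-bidding, truthful bidding, and over-bidding for a $1.50 item.
    for bid in (1.00, 1.50, 2.00):
        print(f"bid ${bid:.2f}: expected surplus ${expected_surplus(1.50, bid):.4f}")
```

Under these assumptions, bidding below the true value forgoes purchases that would have been worthwhile, and bidding above it risks paying more than the item is worth, which is why participants are told that bidding their true value is the best strategy.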
The experimental protocols were approved by the University of Nevada, Reno Social, Behavioral, and Educational Institutional Review Board.
1. Stimuli and Apparatus
Figure 1: Real object (displayed on the turntable) and matched 2-D image of the same item (displayed on a computer monitor). The stimuli in this experiment consisted of 60 popular snack food items. The real foods (left panel) were photographed on the turntable and their resulting 2-D images (right panel) were matched closely for the apparent size, distance, viewpoint, and background.
Figure 2: Schematic showing turntable components and assembly. (A) Major components of the turntable device and their relative positioning. (B) Assembled turntable apparatus with 20 individual cells. A real object can be placed inside each cell. The vertical dividers prevent participants from viewing items in neighboring cells.
Figure 3: How to set up and use the turntable apparatus for testing. (A) Setup of the turntable apparatus ready for testing. Once the turntable has been assembled it should be placed on a table at a comfortable height for a seated participant. A vertical partition should be created and placed between the participant and the turntable. Within the partition, there should be a viewing aperture. A 'participant monitor' is used for viewing the 2-D images. The LCD monitor should be positioned behind the vertical partition and viewing aperture, and in front of the turntable. The monitor is mounted on a sliding platform that allows it to move in and out of the participant's view across trials. An 'experimenter monitor', which is placed out of view of the participant, is used to inform the experimenter of which stimulus to present on upcoming trials. (B) View of the apparatus and a real object stimulus from the participant's perspective. Only one food item should be visible to a participant at a time. A keyboard tray should be attached to the desk directly in front of where the participant is seated. Participants make responses with a computer mouse. (C) Side view showing the participant monitor mounted on the sliding platform. For image trials, the experimenter slides the participant monitor into the viewing aperture. The participant monitor is retracted behind the vertical partition on real object trials. (D) Aerial schematic showing the setup of the turntable apparatus. A single real object can be placed in each of the 20 cells of the turntable. The participant should be seated in front of the viewing aperture while wearing the computer-controlled visual occlusion glasses. The experimenter can view upcoming trials on the experimenter monitor and manually rotate the turntable, or move the participant monitor, as necessary. Panel C of this figure has been reprinted from reference7 with permission from Elsevier.
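Because the participant monitor sits in front of the turntable, an image and the real object it depicts are viewed at different distances. One way to reason about matching apparent size is in terms of visual angle. The sketch below is an illustration only: the function names and the example distances are hypothetical and are not measurements from the apparatus; only the visual-angle formula itself is standard geometry.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def matched_image_size_cm(object_size_cm, object_distance_cm, monitor_distance_cm):
    """On-screen size that subtends the same visual angle as the real object.

    Equal visual angles imply equal size-to-distance ratios, so the image
    is scaled by monitor_distance / object_distance.
    """
    return object_size_cm * monitor_distance_cm / object_distance_cm


if __name__ == "__main__":
    # Hypothetical example: a 10 cm wide snack viewed at 60 cm on the turntable,
    # with the monitor surface at 50 cm from the participant's eyes.
    image_width = matched_image_size_cm(10.0, 60.0, 50.0)
    print(f"image width on screen: {image_width:.2f} cm")
    print(f"real object angle: {visual_angle_deg(10.0, 60.0):.3f} deg")
    print(f"image angle:       {visual_angle_deg(image_width, 50.0):.3f} deg")
```

In practice the on-screen size in centimetres would still need to be converted to pixels using the monitor's physical dimensions and resolution, which are not specified here.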
2. General Procedure: Randomization and Design
3. Procedure for Randomization and Design
Figure 4: Experimental design for the current study. The experiment consisted of 4 phases: (1) food preference- and familiarity-rating task, (2) bidding task, (3) food auction, (4) in-lab waiting period. Participants first completed either a preference- or familiarity-rating task (counterbalanced across participants). In the preference task, participants viewed an image of each snack food item for 3 s and then rated how much they liked the item (using a -7 to 7 rating scale) using a sliding analog bid bar. For the familiarity rating task, participants indicated how familiar they were with the item (using a 0 to 3 rating scale). Next, participants completed a bidding task in which they rated how much they were willing to pay ($0-$3) for each snack food item. Half of the stimuli were presented as real foods and half were presented as 2-D images. Viewing time on each trial was controlled using computer-controlled visual occlusion glasses. At the start of the trial, the glasses transitioned to the 'open' (transparent) state for 3 s, before returning to the 'closed' (opaque) state for a 3 s inter-trial interval. The spectacles then opened to allow the participant to record a response. Once the bidding task had been completed, an 'auction' was conducted to determine whether a participant 'won' a food item, and at what price. The auction was followed by a mandatory 30 min waiting period in the lab. If the participant won a food item, they could consume the food during the waiting period. All participants were asked to remain in the lab for the waiting period whether or not a food item was won during the auction. This figure has been reprinted from reference7 with permission from Elsevier.
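The bidding-task design summarized in Figure 4 (half of the foods shown as real objects, half as images, with trials interleaved in random order and item-to-format assignment counterbalanced across participants) could be sketched as follows. The splitting rule (alternating items of a sorted list), the use of participant-number parity for counterbalancing, and all names are assumptions made for illustration; the original study's randomization code is not reproduced here.

```python
import random


def build_bidding_trials(food_items, participant_id, seed=None):
    """Return a shuffled list of trials, each with an item and a display format.

    Hypothetical scheme: items are split into two fixed halves; participants
    with even IDs see half A as real foods, participants with odd IDs see
    half B as real foods, so each item appears equally often in each format
    across participants.
    """
    rng = random.Random(seed)
    items = sorted(food_items)
    half_a, half_b = items[0::2], items[1::2]
    real_items = set(half_a if participant_id % 2 == 0 else half_b)

    trials = [
        {"item": item, "format": "real" if item in real_items else "image"}
        for item in items
    ]
    rng.shuffle(trials)  # interleave real and image trials in random order
    return trials


if __name__ == "__main__":
    foods = [f"food_{i:02d}" for i in range(60)]  # placeholder names for 60 snacks
    for trial in build_bidding_trials(foods, participant_id=1, seed=42)[:5]:
        print(trial)
```

A list like this could be shown on the experimenter monitor so that the experimenter knows whether to rotate the turntable or slide the participant monitor into the viewing aperture on the upcoming trial.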
4. Participant Screening and Scheduling
5. Questionnaire Procedure
6. Preference- and Familiarity Rating Task Procedure
7. Bidding Task Procedure
8. Food Auction/ 30 Min Waiting Period Procedure
9. Calorie Estimation Procedure
10. Data Analysis
Representative results from this experiment are presented below. A more detailed description of the results, together with a follow-up study, can be found in the original publication7. We used a linear mixed effects model with the dependent variable of Bid, and independent variables of Display Format, Preference, Caloric Density, and Estimated Calories. As expected, and in line with previous studies12,14, there was a strong positive relationship between Preference ratings and Bids (F(1,1655) = 1803.69, p < .001) such that a one unit increase in Preference was associated with an increase of $0.15 in Bid value (β = .15, t(1655) = 42.47, p < .001; d = 8.03). There was also a significant main effect of Caloric Density on Bids (F(1, 1649) = 6.87, p < .01). A one unit increase in Caloric Density was associated with an increase of $.024 in Bids (β = .024, t(1649) = 2.62, p < .01; d = 0.50). The main effect of Estimated Calories was also significant (F(1, 1672) = 6.88, p < .01)11. A one unit increase in Estimated Calories was associated with an increase of $.009 in WTP (β = .009, t(1671) = 2.62, p < .01; d = .50). In other words, observers rated foods that were perceived to be of greater caloric content to be more valuable than foods of lower caloric content. Critically, after controlling for all other factors, we found a significant main effect of Display Format (F(1, 1645) = 7.99, p < .01, d = .53) in which there was a 6.62% increase in Bids for real foods versus food images. The amplification in WTP for real foods (vs. images) was relatively consistent across participants, with 20 out of 28 participants showing the effect. For illustrative purposes, Figure 5 displays average Bid values for each snack food item as a function of Preference, separately for foods shown as real objects (red) and images (blue). Similarly, Figure 6 displays average Bid values for each snack food as a function of Caloric Density, separately for foods in each Display Format. 
The amplification in WTP for real foods vs. images is evident in both Figure 5 and Figure 6. Importantly, the effect of Display Format on bids was constant across food Preference (F(1, 1644) = .025, p = .88), Caloric Density (F(1, 1643) = 2.54, p = .11) and Estimated Calories (F(1,1643) = .11, p = .74), and there were no significant higher-order interactions between any other factors (all p-values ≥ .11).
Although we observed an effect of estimated calories on Bids, the effect was relatively weak. This result may be explained by the fact that participants performed the estimation task in response to text prompts after the main experiment, rather than while looking at the foods at the time of stimulus presentation. Furthermore, estimating the number of calories in a given food item is not necessarily an intuitive task; many observers are unaware of (or do not pay attention to) the caloric density of the foods they consume.
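For readers less familiar with mixed effects models, the main-effects structure of the analysis reported above can be written schematically as follows. The random-effects structure shown (a single random intercept per participant) is an assumption made for illustration, since the exact specification is given in the original publication7; the interaction terms that were also tested are omitted, and the subscript and symbol names are ours.

```latex
% i indexes participants, j indexes snack food items
\begin{aligned}
\mathrm{Bid}_{ij} ={}& \beta_0
  + \beta_1\,\mathrm{DisplayFormat}_{ij}
  + \beta_2\,\mathrm{Preference}_{ij}
  + \beta_3\,\mathrm{CaloricDensity}_{j} \\
 &+ \beta_4\,\mathrm{EstimatedCalories}_{ij}
  + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

In this notation, the reported coefficient of .15 for Preference corresponds to \beta_2, and the main effect of Display Format corresponds to \beta_1, with DisplayFormat coded as an indicator for real-object (versus image) trials.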
Figure 5: Average monetary bids for each snack food plotted as a function of preference and display format. As expected, there was a strong positive relationship between monetary bids and food preference ratings, with higher bids for foods that were more strongly liked. Importantly, there was a significant main effect of Display Format in which bids for real foods were greater than matched food images. There was no significant interaction between the effect of display format and preference. Mean bid values ($) for the foods are displayed separately for the real foods (red) and 2-D images (blue). Each data point represents the group average bid for each food item, separately for foods in each display format. Solid red and blue lines represent lines of best fit for the real object and image conditions, respectively. This figure has been reprinted from reference7 with permission from Elsevier.
Figure 6: Average monetary bids for each snack food plotted as a function of caloric density and display format. We found a significant positive relationship between bids and actual caloric density, with higher bids for foods of higher caloric density. There was no significant interaction between the effect of display format and caloric density. Mean bid values ($) for the foods are displayed separately for the real foods (red) and 2-D images (blue). Each data point represents the group average bid for each food item, separately for foods in each display format. Solid red and blue lines represent lines of best fit for the real object and image conditions, respectively. This figure has been reprinted from reference7 with permission from Elsevier.
The overarching goal of the current paper is to facilitate future studies of 'real world' object vision by providing detailed information about how to present large numbers of real-world objects (and images) under controlled experimental conditions. We present an ecologically-valid approach for studying the factors that influence dietary choice and food valuation. We describe methods employed in a recent study of human decision-making7 in which we examined whether snack foods presented in the form of real-world objects are valued differently to foods presented as 2-D images. In our experiment7, hungry college students placed monetary bids on a range of everyday snack foods. Using a within-subjects design, half of the stimuli were presented to each observer as real foods and the remainder were presented as high-resolution colored 2-D photographs of foods. The real foods and food images were matched closely for apparent size, distance, background, viewpoint, and illumination. In an important departure from previous studies7, environmental conditions and stimulus timing were identical across the different display formats. The order of trials in each display format was randomized throughout the experiment using a custom-built turntable device. At the start of the testing session, participants rated their preference for, and familiarity with, sixty different appetitive snack foods (presented as images). In the main experiment, observers indicated their willingness-to-pay (WTP) for each of the sixty foods, which were displayed either as real objects or 2-D images. Assignment of food items to the real object or image conditions was counterbalanced across observers. Following from an earlier study that addressed a similar question12, we measured WTP using a Becker-DeGroot-Marschak (BDM)35 bidding task in which observers entered a monetary bid ($0-$3) for each snack food to 'win' the opportunity to consume a food at the end of the experiment.
Given the nested structure of the data, we used linear mixed effects modeling to determine the extent to which WTP was influenced by display format, food preference, caloric content, and estimated calories. We found that observers were willing to pay 6.62% more for foods displayed as real objects versus food images7. The amplification in value for real food displays was consistent across all levels of food preference, as well as across actual and estimated caloric content of the foods. These results are surprising because participants knew that they could receive the same (real) snack food reward at the end of the experiment regardless of the format in which the food was presented during the bidding task. Importantly, the findings confirm that there is a reliable 'real-food exposure effect' on willingness-to-pay7,12 that cannot be accounted for by differences in environmental context, stimulus presentation method, or trial timing across display formats.
In summary, we have provided detailed methods that describe how to prepare real object stimuli and closely-matched 2-D computerized images of the same items, as well as methods for creating a manually-operated turntable for presenting large numbers of real objects and images in interleaved succession. We provided instructions for controlling stimulus presentation and viewing time across all trials, for example, by using computer-controlled display glasses. The methods presented here open up new avenues to examine the underlying mechanisms for the observed effects. For example, future studies could assess directly the impact of stereopsis by presenting real-world stimuli under monocular viewing conditions (which could, for example, be tested easily using monocular vs. binocular states of the computer-controlled glasses described here). This would form a nice comparison with the image-based trials in which both motion parallax and stereopsis provide conflicting depth information.
Although we have offered practical solutions for presenting real-world objects under controlled viewing conditions, working with real objects in the laboratory is undeniably challenging, costly, and time-consuming. In addition to the technicalities associated with controlling stimulus parameters such as illumination, position, size and timing, collection and careful preparation (e.g., mounting) of real object stimuli can be painstakingly slow compared to the time that would be required to prepare images alone. The experimenter(s) must be well-practiced at locating the correct exemplars prior to each trial within required time limits, and there are obvious possibilities for experimenter error. In some cases where trial numbers are limited, such as in fMRI8,39 and patient10 studies of real-object vision, we use a video camera to record which exemplars were presented on each trial and the recordings are cross-checked post-hoc for accuracy. There are additional challenges with working with foods, which are perhaps a unique class of real object stimuli. Depending on the number of items used in the study, a relatively large selection of foods must be kept fresh, on-hand, and in relatively close proximity to the testing room. In decision-making paradigms involving foods, the stimuli are typically shown with the packaging opened and some of the contents visible. Although many manufactured foods seem to have an indefinite shelf life (e.g., the Twinkie), most items need to be replaced regularly to maintain freshness and visual appeal. Together, these conditions make it difficult to control exactly the appearance of the foods between real and image formats to the degree that we have found is possible with non-perishable stimulus classes, such as objects and tools.
It is also important to note that we modified our turntable apparatus from the way it appeared in the original study7 (black) to the way it is depicted here (white) because we found that the white apparatus was easier to clean and stimulus contrast was improved.
The above considerations raise the critical question of whether or not the time and resource costs of working with real objects are justified, or whether similar results can be obtained using more convenient image displays. The results from our decision-making paradigm7 indicate that real food displays elicit a constant increase in valuation (i.e., a linear effect) that does not interact with other factors such as preference or caloric density. These results from decision-making dovetail with findings from other domains of human cognition. For example, compared with images, real-world objects are recognized more easily10,40,41, are remembered better42, and capture attention more strongly43,44. Compared to 2-D images, fMRI repetition suppression effects are reduced for real objects8. Similarly, fine-grained examination of the temporal dynamics of brain responses to real objects as measured by high-density EEG reveals that real objects (vs. images) elicit stronger and more prolonged desynchronization of the mu rhythm, a signature of activation in visuo-motor networks involved in automatic planning of motor actions9. The amplification in mu desynchronization for real objects is independent of early signal differences related to stereopsis9. Taken together, these findings suggest that the pattern of results that could be obtained using image displays may be broadly consistent with, but just less compelling than, what might otherwise have been observed had real-world objects been used. In other words, if findings from studies of image vision transfer predictably to real object vision, then the translational value of basic research studies of image vision is preserved. Although there is currently insufficient data to draw firm conclusions on this issue, recent evidence for dissociations in the effects of real objects across motor areas in the left versus right hemispheres9 and across egocentric distances6 raises concerns about this assumption.
For example, the effect of real objects on attentional capture falls to the levels observed for 2-D and 3-D images when the objects are positioned outside of reach of the observer, or when they are within reach but behind a transparent barrier6, suggesting that the potential for manual interaction with a real object (but not an image) determines how it is processed. Future studies could use the protocols described here to investigate whether similar underlying causal mechanisms modulate 'real-food exposure effects' on willingness-to-pay. For example, a distance or barrier manipulation6 could be employed to determine whether real snack foods that are reachable or graspable are processed differently to those that are not (and to determine whether the same manipulation has any effect on processing of food images). Future studies using ecologically-valid real-object stimuli are required to draw definitive conclusions on this issue. Importantly, it may not be the case that similar mechanisms are at play in different cognitive domains, or in different tasks. Nevertheless, our approach to working with real-world objects promises to provide important new insights into the underlying processes and mechanisms that drive naturalistic vision.
The authors have nothing to disclose.
This work was supported by grants to J.C. Snow from the National Eye Institute of the National Institutes of Health (NIH) under Award Number R01EY026701, the National Science Foundation (NSF) [grant 1632849] and the Clinical Translational Research Infrastructure Network [grant 17-746Q-UNR-PG53-00]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, NSF or CTR-IN.
Name | Company | Catalog Number | Comments |
EOS Rebel T2i Body Camera | Canon | 4462B001 | |
MATLAB | MathWorks | R2017b | Computer programming software. Download this additional free toolbox: PsychToolbox 3.0.14 |
Photoshop | Adobe | CS6 | |
PLATO Visual Occlusion Glasses | Translucent Technologies Inc. | N/A | |
SPSS | IBM | Version 22 | Statistical analysis software |
ToTaL Control System (USB) | Translucent Technologies Inc. | N/A | The ToTaL Control System controls the PLATO spectacles |