This article presents a new conceptualization of the in-store search process, the 3S Model, which captures customers’ visual attention at three distinct levels of analysis: Stock, Shelf, and Store. We illustrate the usefulness of our conceptualization through three eye-tracking studies, one from each level of analysis in the 3S Model.
Several models of the in-store search process exist in the fields of retailing, marketing, and consumer-based research. The present article presents a new conceptualization of this search process, which captures customers’ visual attention at three distinct levels of analysis: Stock, Shelf, and Store. We refer to this conceptualization as the 3S Model and illustrate its usefulness through three eye-tracking studies, one from each level of analysis. Our experimental examples, which range from manipulating certain stimuli on a single product (e.g., the placement of textual and pictorial packaging elements) to manipulating the entire shopping trip for customers during their stay in a store (e.g., through more or less specific shopping tasks), highlight the broad applicability of this alternative approach for understanding customers’ in-store search behavior. Thus, our model can be seen as a helpful tool for researchers interested in how to conduct experimental eye-tracking studies that shed light on the perceptual processes preceding product choices and purchase decisions. The 3S Model is equally suitable in controlled lab conditions and under ecologically valid settings in the real retail environment. Furthermore, it can be used from the micro level, with a focus on the meaningful metrics on a particular product, through the intermediate level, with the emphasis on the area surrounding products in shelves and other in-store spaces, all the way to the macro level, examining customers’ navigational paths throughout a store as a function of their shopping tasks, cognitive capacity, or ability to acquire in-store information.
Several models of the in-store search process have been offered throughout the years in the fields of retailing, marketing, and consumer behavior. Common conceptualizations of this search process are dichotomizations into navigation and decision-making1 or into movement and contact2, respectively, in which customers move and navigate inside the store to reach a desired area where they finally decide which particular items to purchase or interact with employees helping them to make informed choices. While we see the value of such conceptualizations, they do not really capture the various layers of the retail environment and their influence on customers’ in-store search behavior.
Thus, the aim of this article is to present an alternative model of the in-store search process, hereinafter referred to as the 3S Model, which captures and discusses customers’ visual attention at three distinct levels, from micro to macro: Stock, Shelf, and Store. According to our conceptualization, the Stock level represents a particular product for sale (i.e., a stock keeping unit; SKU), and includes customers’ visual attention towards specific stimuli on product packaging, such as brand logos, textual elements, and pictorial elements. Next, at the intermediate Shelf level, the focus is not on elements located on specific products or SKUs, but rather on the area surrounding such units and how the configuration of that area can influence customers’ visual attention. Apart from shelves and shelf layout, this level also includes, for example, in-store displays and other point-of-purchase material. Finally, the Store level represents the entire store environment, everything included, with SKUs and shelves both acting as building blocks. The emphasis at this last level is on customers’ movements and navigation throughout the store, depending on their specific shopping tasks, cognitive capacity, and ability to acquire information.
In what follows, we give three eye-tracking examples, each from one of the above-mentioned S-levels, that jointly illustrate how the in-store search process can be understood and studied from the Stock, Shelf, and Store levels with respect to the suitable research topics, methodologies, and analyses.
Stock Level
Previous research has argued for two distinct views on how best to organize textual and pictorial packaging elements: one based on recall3 and the other based on preference4,5,6. According to the recall view, the optimal packaging design locates textual elements on the right side of a package and pictorial elements on the left side, since people tend to recall these element types better when they are located in such a way3. In contrast, the preference view postulates that it should be more advantageous to locate textual elements on the left side of a package and pictorial elements on the right side, since people prefer such an element organization and find it more aesthetically appealing5,6. While recall and preference are both important variables influencing consumer choice, they do not provide any insights into how a package should be designed for customers to quickly detect its different packaging elements. This is important since choices of packaged products depend on whether the products can capture customers’ attention and convey an adequate message within a very limited time7,8,9,10. Our previous publication11 therefore aimed to examine how the placement (left vs. right) of textual and pictorial packaging elements influences detection time for these element types.
Shelf Level
Shelf space in the retail environment is an important strategic tool that increases the possibility for products to be seen and sold12. Valenzuela and Raghubir13 showed that premium products tend to be located on the top of the shelf and budget products on the bottom. Hence, experiences from the store’s positioning scheme help to form consumer beliefs about vertical spatial position. Supporting this notion, the same authors later showed that vertical positioning is a diagnostic cue used by the customer in value judgment, with products on the top perceived to have higher value than products on the bottom14. Although they did not specifically test the influence of spatial beliefs on information processing, they argued that beliefs about vertical positioning reflect heuristic rather than systematic processing. This anticipated relationship indicates that value judgments made from vertical positioning are fast and frugal15. Heuristic processing of spatial information suggests that customers’ visual attention will be guided towards vertical positions believed to contain a certain value. Therefore, if a customer is looking for a premium product, visual attention should be guided upwards, independent of whether premium products are placed on the top vertical position or not. Consequently, if vertical positioning is a diagnostic cue in value judgment, then visual attention towards vertical shelf levels should vary depending on activated beliefs and independent of actual content16. The aim was to explore how beliefs about spatial positioning (e.g., expensive is up and cheap is down) influence customers’ visual search for premium and budget alternatives, respectively.
Store Level
A visit to a store usually entails making a series of purchase decisions. It is, therefore, important to investigate purchase decisions as part of a larger task in addition to investigating the process of a single decision. Previous research on customer decision-making has shown that customers make product choices in a matter of seconds7 from a very small subset of all available products17,18. It is noteworthy that even though customers arrive at the store with their experiences, preferences, and shopping goals, it has been estimated that 80% of purchase decisions are made in the store during the shopping trip19. It has been proposed that this process is an example of the efficacy of using heuristic decision strategies20. A few studies have investigated the visual process of choosing a product from a single shelf21, but they have not looked at a decision as part of a greater whole or at the extent to which one decision influences subsequent decisions. Our paper, therefore, investigated to what extent the complexity of an initial purchase decision (specific vs. non-specific) influences visual attention during the next decision, as this is the reality of most decisions made in a store22.
The protocol described here is organized in the same chronological order as a typical research study. First, the definition of a research question and the study design is described, after which the choice of eye-tracking equipment is delineated. Next, the different steps of the data collection procedure are explained and, lastly, the data processing is outlined. Throughout the protocol, differences in procedure due to lab or field-based data collection are clearly stated.
The protocol outlined below is in line with the current ethics regulations of the authors’ institutions. In order to ensure this, important aspects of the design are: voluntary participation, the use of ordinary shopping tasks as experimental stimuli or instructions, and no collection of personal data. However, as ethics regulations can differ between institutions, please consult the local institution’s human research ethics committee before conducting any research.
1. Experimental design and stimuli
2. Choice of eye-tracking equipment
3. Data collection procedure
4. Data processing
Stock Level Findings
A total of 185 participants had complete eye-tracking recordings and were included in the study. We based our analysis on those participants who detected the packaging element within the time limit of 7.0 seconds. Our dependent variable was time to first fixation (TTFF), which in this case represents the time it took from stimulus exposure until participants detected and thus fixated on the packaging element in question (measured in milliseconds but depicted in seconds). Fixations are the most commonly reported data points in eye-tracking research and are valid measures of visual attention24,25,26,27. TTFF did not differ between the two textual elements (F < 1), and these stimuli did not interact with location to influence TTFF (F < 1). Therefore, we combined them into a single text condition to facilitate parsimonious analyses, after which we conducted a 2 (Location: Left, Right) × 2 (Stimuli: Textual, Pictorial) between-subjects Analysis of Variance (ANOVA) on TTFF. The ANOVA revealed no main effect of Location (F < 1), no main effect of Stimuli (F(1, 114) = 1.09, p = .30), but did reveal a statistically significant two-way interaction (F(1, 114) = 4.46, p = .011). Inspection of cell means revealed that the pictorial packaging element was detected more quickly when located on the right (M = 2.27) versus left (M = 3.82) side of the package, whereas the textual packaging elements were detected more quickly when located on the left (M = 2.08) versus right (M = 3.01) side of the package; see Figure 1. Thus, the results on detection time for textual and pictorial packaging elements support the element organization advocated by the preference view5,6 rather than the recall view3 and suggest that preference may be a function of easy information acquisition.
Figure 1: TTFF in seconds as a function of packaging element (textual, pictorial) and location (left, right).
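To make the TTFF measure concrete, the following is a minimal sketch of how it could be derived from a list of fixations for one trial. It is not the export format or algorithm of any particular eye-tracking software: the Fixation record, the AOI labels, and the sample trial are hypothetical, and the sketch assumes that fixation onsets are already expressed relative to stimulus onset. Only the 7.0-second limit and the conversion from milliseconds to seconds come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

TIME_LIMIT_MS = 7000  # 7.0-second detection limit reported in the text


@dataclass
class Fixation:
    onset_ms: float  # fixation start relative to stimulus onset (assumed)
    aoi: str         # label of the fixated packaging element (hypothetical labels)


def time_to_first_fixation(fixations: Iterable[Fixation], target_aoi: str,
                           limit_ms: float = TIME_LIMIT_MS) -> Optional[float]:
    """Return TTFF in seconds, or None if the target was not fixated within the limit."""
    onsets = [f.onset_ms for f in fixations if f.aoi == target_aoi]
    if not onsets:
        return None  # target never fixated: trial excluded from the analysis
    first = min(onsets)
    if first > limit_ms:
        return None  # detected too late: also excluded
    return first / 1000.0  # measured in milliseconds, reported in seconds


# Hypothetical trial: logo first, then the textual element, then the pictorial element.
trial = [Fixation(180, "logo"), Fixation(950, "text"), Fixation(1840, "picture")]
ttff_picture = time_to_first_fixation(trial, "picture")
ttff_missing = time_to_first_fixation(trial, "barcode")
```

Returning None for undetected or late targets mirrors the decision to base the analysis only on participants who detected the element within the time limit.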
Shelf Level Findings
A total of 128 participants had complete eye-tracking recordings and were included in the study. The dependent variable was TTFF on target, which here means the time it took from stimulus exposure until participants fixated on either a premium product or a budget product, depending on their randomly assigned experimental condition and the shelf configuration (again measured in milliseconds but depicted in seconds). A 2 (Congruency: Congruent, Incongruent) × 2 (Search Task: Premium, Budget) between-subjects ANOVA on TTFF on target showed a significant main effect of Congruency (F(1, 122) = 7.72, p = .006), where participants detected the target faster in the congruent condition (M = 0.94) than in the incongruent one (M = 1.45). Thus, averaged across search tasks, participants detected the target more quickly when it was located on the vertical position that best serves as a cue of its value (e.g., premium products on the congruent top position instead of the incongruent bottom position). There was also a significant main effect of Search Task (F(1, 122) = 6.78, p = .010), where the budget search task led to faster target detection (M = 0.96) than the premium search task (M = 1.43). These two main effects were qualified by a significant two-way interaction (F(1, 122) = 78.57, p < .001). Inspection of cell means revealed that, for the premium product, participants detected the target faster in the congruent (top) location (M = 0.37) than in the incongruent (bottom) location (M = 2.50). For the budget product, however, participants detected the target faster in the incongruent (top) location (M = 0.40) than in the congruent (bottom) location (M = 1.51); see Figure 2. Taken together, these results show that participants tend to move their gaze upwards independent of task; however, they turn their gaze downward faster in a budget task than in a premium task.
Figure 2: TTFF on Target in seconds as a function of search task (premium, budget) and congruency (congruent, incongruent).
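The relation between the cell means and the main effects reported above can be checked with a few lines of arithmetic. The sketch below uses only the four cell means given in the text and assumes roughly equal cell sizes, so that each marginal mean is the unweighted average of its two cells; the variable names are illustrative. Under that assumption the marginal means match the reported values to within rounding, and re-grouping the same cells by shelf position shows the top-versus-bottom difference behind the interaction.

```python
from statistics import mean

# Cell means in seconds, as reported in the text, keyed by (search task, target position).
cell_means = {
    ("premium", "top"): 0.37,
    ("premium", "bottom"): 2.50,
    ("budget", "top"): 0.40,
    ("budget", "bottom"): 1.51,
}

# The position that matches the "expensive is up, cheap is down" belief for each task.
CONGRUENT_POSITION = {"premium": "top", "budget": "bottom"}


def mean_where(predicate):
    """Unweighted mean of the cells selected by predicate (assumes equal cell sizes)."""
    return mean(m for (task, pos), m in cell_means.items() if predicate(task, pos))


congruent = mean_where(lambda task, pos: CONGRUENT_POSITION[task] == pos)
incongruent = mean_where(lambda task, pos: CONGRUENT_POSITION[task] != pos)
premium = mean_where(lambda task, pos: task == "premium")
budget = mean_where(lambda task, pos: task == "budget")

# Re-grouping by physical position shows the upward bias directly.
top = mean_where(lambda task, pos: pos == "top")
bottom = mean_where(lambda task, pos: pos == "bottom")

for name, value in [("congruent", congruent), ("incongruent", incongruent),
                    ("premium", premium), ("budget", budget),
                    ("top", top), ("bottom", bottom)]:
    print(f"{name}: {value:.3f} s")
```

Because congruency is defined relative to the search task, the crossover in the Congruency × Search Task interaction corresponds to a consistent advantage for targets placed on the top shelf position.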
Store Level Findings
The study included 66 participants with complete eye-tracking data. The dependent variable was the number of observations on the areas of interest (AOIs), with the AOIs defined for all relevant portions of the store (parts of the store that were not of interest for the analysis were not coded). The number of observations on an area is a frequently used measure in eye-tracking studies and serves as an indicator of interest28,29. We conducted a 2 (Task Specificity: Specific, Non-Specific) × 2 (Choice Task: First, Second) mixed ANOVA with the number of observations on AOIs as the dependent variable, choice task as the repeated measure, and task specificity as the between-subjects factor. The results showed no significant main effect of the between-subjects factor (F(1, 64) = 1.71, p = .20). However, there was a significant main effect of choice task (F(1, 64) = 12.16, p < .001), where the first choice was completed with fewer observations (M = 19.20) than the second (M = 25.08). This main effect was qualified by a significant two-way interaction (F(1, 64) = 11.42, p = .001). Inspection of cell means revealed that participants in the specific choice group observed a fairly equal number of AOIs during their first (specific) choice task (M = 23.39) and their subsequent choice task (M = 23.58). In contrast, participants in the non-specific choice group observed a smaller number of AOIs during their first choice task (M = 15.00) compared to the second choice task (M = 26.58); see Figure 3. These results show how the specificity of an initial shopping goal influences customers’ visual search behavior during a choice task and how such a choice task affects the visual behavior during subsequent choices.
Figure 3: Number of observations on AOIs as a function of task specificity (specific, non-specific) and choice task (first, second).
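As an illustration of the dependent variable at the Store level, the sketch below counts observations on AOIs from a manually coded sequence of fixations. The operational definition is an assumption made for this example and is not taken from the study: an observation is treated as one uninterrupted run of fixations on the same coded AOI, and uncoded parts of the store are represented by None and ignored. The AOI labels and the two sequences are hypothetical.

```python
from itertools import groupby
from typing import Optional, Sequence


def count_aoi_observations(coded_fixations: Sequence[Optional[str]]) -> int:
    """Count runs of consecutive fixations on the same coded AOI.

    None marks fixations on parts of the store that were not coded;
    these runs are skipped but still separate two visits to the same AOI.
    """
    return sum(1 for aoi, _run in groupby(coded_fixations) if aoi is not None)


# Hypothetical coded sequences for one participant's two choice tasks.
first_task = ["dairy", "dairy", None, "bread", "dairy", "dairy"]
second_task = ["coffee", None, None, "coffee", "tea", "tea", "coffee"]

first_count = count_aoi_observations(first_task)
second_count = count_aoi_observations(second_task)
```

Per-participant counts of this kind, one for each choice task, would form the repeated measure in a mixed design such as the one reported above.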
In this article, we have used some of our prior research studies to illustrate a new conceptualization of the in-store search process. Specifically, our 3S Model – with its Stock, Shelf, and Store levels – offers a new way to examine customers’ visual attention from a process perspective by means of eye-tracking methodology. Previous research has typically divided the in-store search process in broad terms such as navigation and decision-making1 or movement and contact2. The contribution of our 3S Model is that it captures the various layers of the retail environment and the linkages between these different S levels in a more nuanced way.
As with all research, the most critical aspect is the design of the experiments. Thus, taking the time to properly design one’s study is crucial to the success of the study. Furthermore, as the protocol described above includes the choice between lab and field settings, which is also a choice between a stationary and a head-mounted mobile eye-tracking system, this has to be considered during the design.
This protocol is limited with respect to the detailed instructions regarding the eye-tracking equipment. Since there are multiple producers of eye-tracking hardware and software, this protocol does not include any specific usage instructions as this is simply not feasible. Please consult the manual of the specific eye-tracking equipment.
From a theoretical point of view, the 3S Model allows researchers to more precisely position their studies and to narrow the aim of each experiment. By dividing the in-store search process into the three components of our model, researchers acknowledge and take into account more of the complexities of in-store decision-making. As shown by the provided sample studies, the choice of a specific product can be understood from the design of its packaging, its placement on a shelf, and the goal of the customer. Thus, it is important to understand which part of the in-store search process is currently in focus.
From a practical perspective, the 3S Model clearly shows which parts of the in-store search process are appropriate to investigate in the eye-tracking laboratory versus in the field. Studies under controlled lab conditions facilitate digital manipulation of experimental stimuli on computer or projector screens and automatic coding of a large variety of eye-tracking measures with high levels of accuracy, but at the expense of low ecological validity. Such studies are better suited for examining research questions at the Stock and Shelf levels using stationary eye-tracking systems, due to the difficulty in manipulating shelf layout or packaging elements on consumer goods in real retail settings. Studies in actual field settings have high ecological validity but lower degrees of experimental control, and are typically more labor-intensive as they require manual coding of eye-tracking measures (with lower levels of accuracy). Such studies are particularly well-suited for examining research questions at the Store level, but can also be used at the Shelf level, through reliance on mobile eye-tracking equipment.
The authors have nothing to disclose.
This research was conducted within the Service Innovation for Sustainable Business (SISB) grant, funded by the Swedish Knowledge Foundation (KK-stiftelsen).
Eye tracker | Tobii Technology | Tobii X120 Eye Tracker | Stationary eye-tracking system |
Eye tracker | Tobii Technology | Tobii Glasses | Head-mounted eye-tracking system |