Evaluating the Accuracy of Snap Judgments

JoVE Science Education
Social Psychology

April 30, 2023

Overview

Source: Diego Reinero & Jay Van Bavel—New York University

Social psychologists have long been interested in the way people form impressions of others. Much of this work has focused on the errors people make in judging others, such as the exaggerated influence of central traits (such as "warm" and "cold"), the insufficient weight given to the context in which others' behavior takes place, and the tendency for people to make judgments that conform to their initial expectations about another. However, this focus on errors masks the fact that people are quite good at making fairly accurate judgments about other people's characteristics, an ability that was no doubt important over the course of human evolution.

Indeed, the human ability to make quick sense of social situations and people ranks among our most valuable skills. What is particularly impressive about our ability to make sense of others is not just how little information we need to make inferences, but how well calibrated we can be with so little information. This video shows some experimental techniques used by psychology researchers, including Ambady and Rosenthal in their seminal work,¹ and explores the process of making inferences in the context of students' evaluations of their teachers.

Principles

In much of the early research, judgments were based on exposure to both the verbal and nonverbal channels of the targets' behavior. However, later research suggested that strangers' judgments might be related to observable cues, particularly nonverbal behavior and physical appearance. Thus, new experiments (like the present one) were designed to examine the mediating effect of nonverbal behavior and physical appearance on the accuracy of personality judgments. In these experiments, researchers controlled the information available to raters so that targets were rated solely on the basis of their nonverbal behavior. They also obtained separate judgments of the targets' physical attractiveness to examine the relationship between physical appearance and the accuracy of personality judgments.

Another factor in determining the accuracy of snap judgments is the degree of correspondence between a judgment and a criterion. Early research used self-reported criteria; however, such data are susceptible to bias. Later studies, like the one described here, avoid self-reports in favor of pragmatic and ecologically valid criteria based on third-party reports (here, student evaluations).

Finally, when examining the accuracy of strangers' judgments of targets' personality attributes from very minimal, noninteractive information, researchers consider both molecular and molar nonverbal behavior. Psychologists define molecular behavior as behavior described in small response units (momentary, discrete responses) rather than larger ones. Molar behavior, on the other hand, is described in large response units that take up varying amounts of time.

Procedure

1. Organize materials.

  1. Create the video stimuli from previously filmed footage of 10 college instructors. The content of the teaching should cover a broad array of subject areas.
  2. For each teacher, identify three separate, 10-s clips. The three clips should be taken from the beginning, middle, and end of class, respectively, and feature the teacher alone in the video frame.
  3. Following a Latin-square design, combine the three clips for each teacher in random order. This results in 30 video clips in total (three per teacher).
  4. Compile the end-of-semester student evaluations for each of the 10 instructors in the videos. These evaluations are from the actual courses that correspond to the video footage.
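
The clip-ordering step can be sketched in Python as below. This is only one possible reading of the instruction: it builds a 3 x 3 cyclic Latin square over the clip positions and hands its rows out to teachers after a seeded shuffle. The teacher IDs, the seed, and the function names are hypothetical and are not part of the original protocol.

```python
import random

POSITIONS = ("beginning", "middle", "end")


def latin_square(items):
    """Cyclic Latin square: row i is `items` rotated left by i places."""
    n = len(items)
    return [tuple(items[(i + j) % n] for j in range(n)) for i in range(n)]


def assign_orders(teacher_ids, seed=0):
    """Map each teacher to one row of the square.

    Teachers are shuffled with a fixed seed (an assumed choice) so that
    which teacher gets which row is random but reproducible.
    """
    square = latin_square(POSITIONS)
    shuffled = list(teacher_ids)
    random.Random(seed).shuffle(shuffled)
    return {t: square[k % len(square)] for k, t in enumerate(shuffled)}


teachers = [f"T{n:02d}" for n in range(1, 11)]  # hypothetical teacher IDs
orders = assign_orders(teachers)
for teacher in teachers:
    print(teacher, " -> ".join(orders[teacher]))
```

With 10 teachers and three rows, the rows cannot be used equally often (one row is used four times and the others three times each), so the counterbalancing is approximate.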

2. Participant Recruitment

  1. Conduct a power analysis and recruit a sufficient number of participants to watch and rate the video clips.
    1. Women were preferred in the original study because prior research supports the notion that they are better than men at decoding nonverbal behavior.
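
The protocol does not say which power analysis to run. As a rough sketch, assuming the goal is to detect a correlation of a given size with a two-sided test, the required sample size can be approximated with the Fisher z transformation. The expected effect size, alpha, and power below are placeholder assumptions, and the researcher still has to decide what the unit of analysis is (raters or teachers).

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a population correlation `r`.

    Uses the Fisher z approximation:
        N = ((z_{1 - alpha/2} + z_{power}) / atanh(r))**2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)


# Assumed effect sizes, for illustration only.
for expected_r in (0.3, 0.5, 0.7):
    print(expected_r, n_for_correlation(expected_r))
```

Smaller expected correlations require larger samples, so the assumed effect size should be justified from prior work rather than chosen for convenience.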

3. Data Collection

  1. Tell participants to rate each instructor based on overall nonverbal behavior. In particular, ask participants to judge the teacher on 15 teaching-related dimensions (e.g., accepting, active, attentive, dominant, honest, likable, warm, and professional) on a 9-point Likert scale (1 = not at all; 9 = very).
    1. Participants should receive no other training.
  2. To assess molecular nonverbal behavior (i.e., specific nonverbal actions), have two paid, trained coders watch the same video clips.
  3. For each clip, tally the number of nods, headshakes, smiles, laughs, yawns, frowns, biting of the lips, downward gazes, self-touches, fidgets, emphatic gestures, and weak gestures the teachers made.
  4. Have the two raters also indicate the teacher's symmetry and body posture.
  5. To account for attractiveness effects, also have two coders judge the physical attractiveness of each teacher on a 5-point Likert scale (1 = not at all; 5 = very), using a single photo of each teacher taken from the video.
  6. Fully debrief participants.
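
The molecular coding in steps 2 and 3 amounts to a tally sheet per clip. A minimal Python sketch is shown below, assuming each coder logs one label per observed behavior. The label spellings, the summing across a teacher's three clips, and the averaging across the two coders are illustrative assumptions, not details given in the protocol.

```python
from collections import Counter

# The 12 molecular behaviors listed in step 3 (label spellings are assumed).
BEHAVIORS = (
    "nod", "headshake", "smile", "laugh", "yawn", "frown", "lip_bite",
    "downward_gaze", "self_touch", "fidget", "emphatic_gesture", "weak_gesture",
)


def tally_clip(events):
    """Count coded events for one clip, with zeros for unseen behaviors."""
    unknown = set(events) - set(BEHAVIORS)
    if unknown:
        raise ValueError(f"unknown behavior labels: {sorted(unknown)}")
    counts = Counter(events)
    return {b: counts.get(b, 0) for b in BEHAVIORS}


def teacher_totals(clips):
    """Sum the tallies over a teacher's clips (one event list per clip)."""
    total = Counter()
    for clip in clips:
        total.update(tally_clip(clip))
    return {b: total[b] for b in BEHAVIORS}


def average_coders(coder_a, coder_b):
    """Average two coders' totals behavior by behavior."""
    return {b: (coder_a[b] + coder_b[b]) / 2 for b in BEHAVIORS}


# Hypothetical event logs for one teacher from two coders.
coder_a = teacher_totals([["smile", "nod"], ["fidget", "smile"], ["nod"]])
coder_b = teacher_totals([["smile"], ["fidget", "fidget"], ["nod", "nod"]])
print(average_coders(coder_a, coder_b))
```

Keeping a zero entry for every behavior makes the per-teacher records directly comparable in later correlation analyses.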

4. Data Analysis

  1. Note that the primary outcome variable of interest is teaching effectiveness, which is assessed via ratings made by the teachers' students at the end of the semester.
    1. This includes two items that asked the students to rate the instructor's performance and the quality of the course overall.
  2. Convert these ratings to percentages.
    1. The mean percentage of the two evaluation items serves as the primary dependent variable of interest.
  3. Analyze the ratings of molar nonverbal behavior for reliability.
  4. From these data, compute an overall mean, which represents the molar nonverbal behavior composite score.
    1. Both the composite score and each of the individual 15 ratings are considered as potential predictors of teaching effectiveness.
  5. Analyze coders' ratings of molecular nonverbal behaviors for reliability.
  6. Analyze ratings of teachers' attractiveness for reliability.
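
The analysis steps above can be sketched in Python as follows. Several details are assumptions, because the protocol does not specify them: the evaluation items are taken to be on a 5-point scale, a percentage is taken to mean the share of the scale maximum, and reliability is summarized with the Spearman-Brown "effective reliability" of the mean rating, which is one common option rather than necessarily the one the authors used. All numbers are made up.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def to_percent(rating, scale_max):
    """Express a rating as a percentage of the scale maximum (assumed convention)."""
    return 100.0 * rating / scale_max


def effectiveness(instructor_item, course_item, scale_max=5):
    """Dependent variable: mean percentage of the two evaluation items."""
    return mean([to_percent(instructor_item, scale_max),
                 to_percent(course_item, scale_max)])


def molar_composite(trait_ratings):
    """Overall mean across a teacher's trait ratings (dict: trait -> rating)."""
    return mean(trait_ratings.values())


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def effective_reliability(ratings_by_judge):
    """Spearman-Brown reliability of the mean of n judges.

    R = n * r_bar / (1 + (n - 1) * r_bar), where r_bar is the mean
    pairwise correlation between judges across targets.
    """
    n = len(ratings_by_judge)
    r_bar = mean(pearson(a, b) for a, b in combinations(ratings_by_judge, 2))
    return n * r_bar / (1 + (n - 1) * r_bar)


# Hypothetical data: two evaluation items, a few traits, three judges x five teachers.
print(effectiveness(4.2, 3.8))
print(molar_composite({"warm": 6.5, "active": 7.0, "confident": 5.5}))
print(effective_reliability([[6, 7, 4, 8, 5], [5, 7, 3, 8, 6], [6, 6, 4, 9, 5]]))
```

The same `effective_reliability` function can be applied to the molar ratings, the coders' molecular tallies, and the attractiveness ratings, since each is a set of judges rating the same targets.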

Results

Results indicated that nine of the 15 molar ratings of nonverbal behavior positively correlated with end-of-semester ratings of teacher effectiveness (Figure 1), as did the overall mean molar rating. Molecular behaviors, on the other hand, were less predictive: only frowning was (negatively) correlated with the global molar rating (Figure 2), and only fidgeting was (negatively) correlated with teaching effectiveness (Figure 3). Teacher attractiveness did not significantly relate to teacher effectiveness. More importantly, the effects of nonverbal behavior remained even after statistically controlling for attractiveness. Thus, when given only 30 s of film from one day of instruction, participants' assessments of nonverbal behaviors correlated very highly with students' impressions of their teachers based on a semester's worth of contact.
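
As an illustration of what "statistically controlling for attractiveness" can mean, the sketch below computes a Pearson correlation between the molar composite and teaching effectiveness, and then the first-order partial correlation with attractiveness held constant. The source does not state which procedure was used, so the partial correlation is an assumed choice, and the ten data points are invented for the example.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y with z partialled out.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented values for 10 teachers (not data from the study).
molar = [6.1, 5.2, 7.4, 4.8, 6.9, 5.5, 7.8, 4.2, 6.3, 5.9]
effect = [82.0, 71.0, 90.0, 65.0, 85.0, 74.0, 93.0, 60.0, 80.0, 77.0]
attract = [3.0, 2.5, 3.5, 3.0, 2.0, 4.0, 3.5, 2.5, 3.0, 2.0]

print("r(molar, effectiveness):", round(pearson(molar, effect), 3))
print("partial r, controlling attractiveness:",
      round(partial_corr(molar, effect, attract), 3))
```

If the partial correlation stays close to the zero-order correlation, attractiveness accounts for little of the relationship, which is the pattern the results describe.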

Figure 1: Correlations of molar nonverbal behaviors and students' ratings of teacher effectiveness. Molar nonverbal behaviors (i.e., trait judgments) from the thin-slice video clips were correlated with students' end-of-semester ratings of teacher effectiveness. Nine of the 15 molar nonverbal behaviors (optimistic, confident, dominant, active, enthusiastic, likable, warm, competent, supportive), along with the global variable, predicted ratings of teacher effectiveness.

Figure 2: Correlations of molecular nonverbal behaviors and molar global rating. Molecular nonverbal behaviors (i.e., specific nonverbal actions) from the thin-slice video clips were correlated with the teachers' overall molar global rating. Only frowning was negatively correlated.

Figure 3: Correlations of molecular nonverbal behaviors and students' ratings of teacher effectiveness. Molecular nonverbal behaviors (i.e., specific nonverbal actions) from the thin-slice video clips were correlated with students' end-of-semester ratings of teacher effectiveness. Fidgeting was negatively correlated with the criterion variable: Teachers who fidgeted more with their hands or fiddled with an object, such as chalk or a pen, received significantly lower ratings from their students.

Applications and Summary

The described technique demonstrates that observing just 30 s of behavior is enough to draw accurate inferences about teaching effectiveness. Ambady and Rosenthal repeated this study using even shorter clips and found similar effects: judgments based on three clips as short as 2 s each yielded high correlations with end-of-semester ratings.¹ Knowledge of the nonverbal correlates of effective teaching aids our understanding of the importance of affective behavior in teaching and learning, and is also of practical importance in guiding the selection and training of future teachers.

This body of research has been extended beyond judgments of teaching effectiveness. Research has shown that people can judge how trustworthy an individual is just from viewing a photo of that person for a few brief seconds. Others have found that people can predict who will win an election with above-chance accuracy just based on viewing photos of the candidates. The ability to quickly infer another person's character or traits from a brief exposure is referred to as thin slicing.

People can thin slice others with an impressive degree of accuracy. However, this accuracy depends in large part on knowing which factors are important. For example, research conducted by relationship expert John Gottman and colleagues shows that divorce can be predicted at higher-than-chance levels by viewing a thin slice of video of a couple interacting.² Gottman identified four key behaviors that predict divorce: criticism, contempt, defensiveness, and withdrawing/stonewalling. Interestingly, complaining and anger actually do not predict divorce.

In addition, many professions rely on accurate thin-slicing, ranging from detective work and personal security to poker playing and psychic reading. It may be that implicitly or explicitly learning to attune to the right signals is crucial to developing this expertise.

References

  1. Ambady, N., & Rosenthal, R. (1993). Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness. Journal of Personality and Social Psychology, 64, 431-441.
  2. Gottman, J. M., Coan, J., Carrere, S., & Swanson, C. (1998). Predicting marital happiness and stability from newlywed interactions. Journal of Marriage and Family, 60, 5-22.

Transcript

Upon meeting new people, many individuals tend to make quick judgments of another person—even without much information to go on.

For instance, at a social gathering, someone might immediately think that the guy with cool glasses, whom they’ve never met, is likeable based solely on his appearance. As it turns out, he is easy-going and has a lot of friends.

Remarkably, people are surprisingly accurate when making these first impressions—referred to as snap judgments—simply based on visual cues.

Based on the seminal work of Ambady and Rosenthal, this video demonstrates the experimental techniques used to make snap judgments of instructors’ personalities in comparison to actual evaluations of their teaching effectiveness. We will also explore how such inferences can be applied to other professions that rely on analyzing characteristics.

In this study, participants are asked to watch short, muted video compilations of novel college instructors teaching a variety of subjects and must judge certain attributes. Other trained coders count more specific nonverbal behaviors, as well as rate their physical appearance.

These assessments are ultimately compared to actual teaching evaluations to examine the accuracy of first impressions based on visual traits and distinct, objective actions.

Participants first provide molar ratings—broad trait judgments—based on 15 teaching-related dimensions, such as whether they seem enthusiastic, likeable, and confident. The Likert scale ranges from 1 (not at all) to 9 (very).

In addition, research assistants watch the same clips and tally molecular behaviors—actions that are momentary and discrete—like smiling or nodding. They are also asked to report on the teacher’s symmetry and body posture.

Lastly, based on a single photo taken from the videos, the assistants are asked to rate the physical appeal of each instructor on a 5-point Likert scale, where 1 means “not at all” and 5 equals “very”, to account for effects of attractiveness.

To examine the predictive utility of these snap judgments, each instructor’s end-of-semester teaching evaluations are compiled for nonbiased quantitative comparisons.

Using these forms, the dependent variable is teaching effectiveness, based on averaging two items where students rated the instructors’ performances and overall quality of the courses.

Ultimately, participants’ assessments of molar nonverbal behaviors—given 30 s of film from one day of instruction—are expected to be highly correlated with students’ evaluations of their instructors, which are based on a much longer span—a semester’s worth of interaction.

These findings suggest that very little time is needed to make an accurate first impression, which is known as thin slicing—the ability to quickly infer another person’s character from a very short exposure.

Prior to the experiment, conduct a power analysis to recruit a sufficient number of participants. Additionally, use previously filmed footage of ten college instructors to generate three separate, 10-s clips from each to end up with a total of 30 videos.

For every one, capture a frame to save as their photo for subsequent observations. To complete preparation, compile the end of semester student evaluations for each of the 10 instructors shown, from the actual courses that correspond to the footage.

To begin, escort each participant into the testing room and explain that they will watch videos and assess molar nonverbal behavior—in this case personality traits.

As they view each set of randomized clips, have them judge every instructor’s nonverbal behavior—15 teaching-related adjectives—on a 9-point Likert scale.

Next, to measure molecular nonverbal behavior, ask two trained coders to watch the same segments and tally the number of times each instructor makes one of 12 distinct behaviors, along with details about their symmetry and body posture.

Lastly, to account for the effects of attractiveness, have each coder view the saved images and judge the physical appearances of each instructor on a 5-point Likert scale.

To conclude the experiment, fully debrief participants regarding the actual purpose and procedures of the study.

To compile the data, make sure that the two evaluation responses have been converted into percentages and averaged for each instructor.

Then, create separate graphs to compare the mean values of molar and molecular categories against teaching effectiveness. Plot the correlations for each nonverbal behavior measured.

First, notice that nine of the 15 molar ratings of nonverbal behavior, along with the overall composite average (the global variable), were significantly and positively correlated with teaching effectiveness.

However, molecular behaviors were less predictive. Only fidgeting negatively correlated with teaching effectiveness. Moreover, the relationships remained even after controlling for instructor attractiveness.

In the end, students were able to formulate reliable impressions of instructors’ teaching effectiveness using only 30 s of nonverbal video footage.

Now that you are familiar with how to design a study to evaluate snap judgments in an educational setting, let’s look at how this research extends to other professions that rely on quick inferences to understand other people’s character.

During a poker game, many players rely on snap judgments to size up their competition. Those who make quick inferences about their opponents’ playing style—solely based on a limited amount of visual cues—can win the pot.

However, maintaining accuracy when thin-slicing others depends largely on knowing which factors are important. For example, researchers have shown that divorce can be predicted above chance levels by viewing a very short video of a couple interacting.

In this case, the expected behaviors of complaining or anger did not predict divorce, but rather, defensiveness and withdrawing did. Thus, it may be that implicitly or explicitly learning to attune to the right signals is crucial to developing this expertise.

You’ve just watched JoVE’s video on how to evaluate the accuracy of snap judgments. Now you should have a good understanding of how to design, conduct, and analyze an experiment to study how only a short time is needed to make predictive inferences, as well as how this skill can be useful in other professions.

Thanks for watching!