Chapter 11: Correlation and Regression

11.8: Variation

JoVE Core
Statistics

April 30, 2023

An important characteristic of any data set is its variation. In some data sets, the values are concentrated close to the mean; in others, the values are spread more widely around the mean. The most common measure of variation, or spread, is the standard deviation, which is the square root of the variance.
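As a small illustration of the contrast described above, the following sketch compares two made-up samples that share the same mean but differ in spread. The data values and the helper name `describe` are assumptions chosen for illustration; the sample (n - 1) versions of variance and standard deviation from Python's standard `statistics` module are used.

```python
import statistics

# Hypothetical data: both samples have a mean of 50, but different spread.
tight = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

def describe(values):
    """Return (mean, sample variance, sample standard deviation)."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sum of squared deviations / (n - 1)
    std_dev = statistics.stdev(values)      # square root of the sample variance
    return mean, variance, std_dev

tight_stats = describe(tight)
wide_stats = describe(wide)

print("tight:", tight_stats)
print("wide: ", wide_stats)
```

The sample whose values lie farther from the mean reports the larger standard deviation, even though the two means are identical.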

When an independent variable (x) and a dependent variable (y) are plotted on a scatter plot, the slope of a line through the points describes the rate of change between the two variables. The slope tells us how much the dependent variable (y) changes, on average, for every one-unit increase in the independent variable (x). The y-intercept is the predicted value of the dependent variable when the independent variable equals zero. A regression line, or line of best fit, can be drawn on the scatter plot and used to predict the value of y for a given value of x.
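The slope and y-intercept of a line of best fit can be computed directly from paired data. The sketch below uses the ordinary least-squares formulas, which is one common way to define the line of best fit; the data values and the function names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squared x-deviations about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
print("slope:", slope)
print("intercept:", intercept)
print("predicted y at x = 6:", predict(6, slope, intercept))
```

Here the slope is the average change in y per one-unit increase in x, and the intercept is the predicted y at x = 0.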

The difference between the observed sample value, y, and the value predicted by the regression equation, ŷ, is known as the unexplained deviation. The difference between the predicted value, ŷ, and the sample mean, y̅, is called the explained deviation. The difference between the observed value, y, and the sample mean, y̅, is the total deviation.
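Written out for a single data point, the three deviations fit together as shown below. The symbol \hat{y} for the predicted value is a notational assumption; y and \bar{y} are the observed value and the sample mean as defined in the text.

```latex
\[
\underbrace{y - \bar{y}}_{\text{total deviation}}
  \;=\;
\underbrace{(\hat{y} - \bar{y})}_{\text{explained deviation}}
  \;+\;
\underbrace{(y - \hat{y})}_{\text{unexplained deviation}}
\]
```

The identity holds term by term, because the predicted value cancels when the two pieces on the right are added.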

If we add the squares of the explained deviations for all data points, we get the explained variation. In the same way, adding the squares of the unexplained deviations for all data points gives the unexplained variation, and adding the squares of the total deviations gives the total variation. Dividing the explained variation by the total variation gives the coefficient of determination, r², which represents the proportion (often expressed as a percent) of the variation in the dependent variable y that can be explained by variation in the independent variable x using the regression line.
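The three sums of squares and the resulting r² can be traced through on a small example. This is a minimal sketch assuming a least-squares regression line and made-up data; all variable names are illustrative.

```python
# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept of the regression line.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Predicted value for each observed x.
y_hats = [intercept + slope * x for x in xs]

# Sum the squared deviations over all data points.
explained = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)
unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
total = sum((y - y_bar) ** 2 for y in ys)

# Coefficient of determination: explained variation / total variation.
r_squared = explained / total

print("explained variation:  ", explained)
print("unexplained variation:", unexplained)
print("total variation:      ", total)
print("r squared:            ", r_squared)
```

For a least-squares line, the explained and unexplained variations add up to the total variation (up to floating-point rounding), so r² always falls between 0 and 1.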

This text is adapted from OpenStax, Introductory Statistics, Section 12, Linear Regression and Correlation.
