The goodness-of-fit test is a hypothesis test that determines whether data "fit" a particular distribution. For example, you may suspect that some unknown data fit a binomial distribution. A chi-square test (meaning the distribution for the hypothesis test is chi-square) is used to determine whether there is a fit. The null and alternative hypotheses may be written in sentences or stated as equations or inequalities. The test statistic for a goodness-of-fit test is:
$$\chi^2 = \sum_{k} \frac{(O - E)^2}{E}$$
where:
O = observed values (data), and E = expected values (from theory)
The observed values are the data values, and the expected values are the values you would expect to get if the null hypothesis were true. Each cell’s expected value must be at least five for this test to be valid. The number of degrees of freedom is $df = k - 1$, where k = the number of different data cells or categories.
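As a minimal sketch of how the statistic and degrees of freedom are computed, the Python snippet below sums (O - E)^2 / E over the cells and checks the "at least five" condition on the expected values. The function name `chi_square_gof` and the die-roll counts are hypothetical, chosen only for illustration.

```python
def chi_square_gof(observed, expected):
    """Return (test statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    if any(e < 5 for e in expected):
        raise ValueError("every expected value should be at least 5")
    # Sum of (O - E)^2 / E over all k cells.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom: number of cells minus one.
    df = len(observed) - 1
    return statistic, df


# Hypothetical data: 60 rolls of a die, with the null hypothesis that the die is fair,
# so each of the 6 faces has an expected count of 60 / 6 = 10.
observed = [8, 9, 12, 11, 6, 14]
expected = [10] * 6

statistic, df = chi_square_gof(observed, expected)
print(f"chi-square = {statistic:.2f}, df = {df}")
```

Here the squared differences are 4, 1, 4, 1, 16, and 16, so the statistic is 42 / 10 = 4.2 with 6 - 1 = 5 degrees of freedom.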
The goodness-of-fit test is almost always right-tailed. If the observed values and the corresponding expected values are not close, the test statistic becomes large and falls far out in the right tail of the chi-square curve, which is evidence against the null hypothesis.
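To illustrate the right-tailed nature of the test, the sketch below computes the area to the right of a test statistic under the chi-square curve using only the Python standard library. The helper `chi_square_right_tail` is a hypothetical name, and it evaluates the tail area with a textbook series for the incomplete gamma function; in practice a statistics library would normally be used instead. The two statistics (4.2 and 48.8, each with 5 degrees of freedom) are made-up values standing for a close fit and a poor fit.

```python
import math


def chi_square_right_tail(statistic, df):
    """Right-tail area P(X >= statistic) for X ~ chi-square with df degrees of freedom."""
    if statistic <= 0:
        return 1.0
    a, x = df / 2.0, statistic / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, x) = x^a * e^(-x) * sum_{n>=0} x^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= x / (a + n)
        total += term
        n += 1
    # Divide by Gamma(a) to get the left-tail area, working in logs for stability.
    left_tail = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1.0 - left_tail


# Hypothetical statistics: observed values close to expected vs. far from expected.
p_close = chi_square_right_tail(4.2, 5)
p_far = chi_square_right_tail(48.8, 5)
print(f"close fit: p-value = {p_close:.3f}")
print(f"poor fit:  p-value = {p_far:.2e}")
```

The larger statistic leaves a much smaller area in the right tail, which is why large discrepancies between observed and expected values lead to rejecting the null hypothesis.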