Chi-Square Goodness of Fit Test

Brief Instructions

Enter the observed values under L1. Divide the sample size by the number of possible outcomes. The result is the expected value.

Put the cursor on L2 in the data entry window. Press "(" (left parenthesis), 2ND, "1", and "-" (minus sign). Enter the expected value. Then press ")" (right parenthesis), "x²", "÷", and enter the expected value again. Pressing ENTER will fill in L2.

Now press STAT and choose CALC. Press ENTER, 2ND, "2" and ENTER to run 1-Var Stats on L2. The test statistic is the sum of L2, shown next to Σx = (not the sample standard deviation Sx).

Subtract one from the number of possible outcomes to obtain the degrees of freedom. Next press 2ND and VARS, choose χ²cdf( and press ENTER. Enter the test statistic, 1E99, and the number of possible outcomes minus one. Put commas between these three numbers and a right parenthesis after the last number. Pressing ENTER gives the P-value.

Detailed Instructions

This test is used to decide whether outcomes of a certain event are in the same proportions that were expected. We will always test whether or not all outcomes are equally likely. To demonstrate this procedure we will test whether the days of the week for 300 randomly selected pedestrian deaths are equally likely at 98% confidence (Sullivan, Fundamentals of Statistics, 2nd ed, Pearson Education, Inc. 2008, p.561). The data are shown below.

 Day        Sun.  Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.
 Frequency  39    40    30     40    41      49    61

Assuming the days of the week are equally likely, you would expect 300/7 ≈ 42.9 deaths on average for each of the days. The test statistic for this test is given by the formula

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \qquad (1)$$

where k is the number of possible outcomes, O_i is the observed frequency, and E_i is the expected frequency. In this situation, k = 7 since there are 7 days of the week, the O_i are given in the table shown above, and since we are assuming each day is equally likely, E_i = 42.9 for each i. To aid in computing, we will rename the second row and add a third row to the table above.

 Day   Sun.  Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.
 O_i   39    40    30     40    41      49    61
 E_i   42.9  42.9  42.9   42.9  42.9    42.9  42.9
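The expected frequency can also be checked away from the calculator. The following is a minimal sketch in Python, which the original instructions do not use; the variable names are illustrative assumptions, and the data are the frequencies from the table above.

```python
# Observed pedestrian deaths by day of the week (from the table above).
observed = {
    "Sun.": 39, "Mon.": 40, "Tues.": 30, "Wed.": 40,
    "Thurs.": 41, "Fri.": 49, "Sat.": 61,
}

n = sum(observed.values())   # sample size
k = len(observed)            # number of possible outcomes

# Under the null hypothesis every day is equally likely,
# so each day has the same expected frequency n / k.
expected = n / k

print(f"n = {n}, k = {k}, expected = {expected:.1f}")
```

The unrounded value is 300/7; the text rounds it to 42.9 before entering it into the calculator.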

To save time, we will use the "list" operations of the calculator to do the necessary calculations. Press "STAT" and "ENTER". Clear lists L1 and L2. Enter the O_i under L1. When you are finished, the data entry screen should look like the following.

 L1    L2
 39
 40
 30
 40
 41
 49
 61

Next put the cursor on L2 in the data entry window, and press "(" (left parenthesis), "2ND", "1", "-" (minus sign), "4", "2", "." (decimal point), "9", ")" (right parenthesis), "x²", "÷", "4", "2", "." (decimal point), "9". At this point, the formula at the bottom of the data entry window should look like:

 L2 = (L1 - 42.9)² / 42.9

Now if you press "ENTER", the L2 list will be filled as shown below.

 L1    L2
 39    .35455
 40    .19604
 30    3.879
 40    .19604
 41    .08415
 49    .86737
 61    7.6366

The numbers under L2 are the numbers to be summed in the right hand side of formula (1) above. To find the sum, press "STAT" and choose "CALC". Then press "ENTER", "2ND", "2" and "ENTER" to run 1-Var Stats on L2. The test statistic is the sum of the list, found to the right of Σx = (not Sx, which is the sample standard deviation), and in this case is about 13.21.
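The list operation and the sum can be mirrored in a few lines of Python as a cross-check. This is only a sketch with illustrative variable names, not part of the calculator procedure. It uses the rounded expected value 42.9 to match the L2 column, then repeats the calculation with the unrounded 300/7 to show the effect of rounding.

```python
observed = [39, 40, 30, 40, 41, 49, 61]   # the values entered in L1

# Same formula typed into L2: (L1 - 42.9)² / 42.9
expected = 42.9
contributions = [(o - expected) ** 2 / expected for o in observed]

# The test statistic is the sum of the L2 column (the calculator's Σx).
chi_square = sum(contributions)

for o, c in zip(observed, contributions):
    print(f"{o:>3}  {c:.5f}")
print(f"test statistic (E = 42.9):  {chi_square:.2f}")

# Assumption for comparison only: keep the expected value unrounded.
exact_expected = sum(observed) / len(observed)
chi_square_exact = sum((o - exact_expected) ** 2 / exact_expected for o in observed)
print(f"test statistic (E = 300/7): {chi_square_exact:.2f}")
```

With the rounded expected value the sum is about 13.21, as in the text; with the unrounded value it is about 13.23, a difference too small to change the conclusion in this example.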

The final step is to find the P-value for this test. To do this press "2ND" and "VARS" and choose "χ²cdf(", and press "ENTER". You should see "χ²cdf(" on your screen. The P-value in this case is the probability of obtaining a test statistic at least as big as 13.21 assuming that the null hypothesis holds. So we will enter the test statistic as the lower bound, "1E99" as the upper bound, and then the degrees of freedom with commas between the three numbers and a right parenthesis on the end.

For this example, there are 7 possible outcomes, one for each day of the week. The degrees of freedom is thus 7 - 1 = 6, and we would fill in the chi-square formula as follows.

 χ²cdf(13.21, 1E99, 6)

Now pressing the "ENTER" button will produce the P-value. In this case, the P-value is 0.0398. Since we are testing at 98% confidence, the significance level is 1 - 0.98 = 0.02. The P-value is greater than the significance level, so we do not reject the null hypothesis. Hence, the evidence is not strong enough at the 98% confidence level to say that the days of the week are not all equally likely.
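The P-value and the decision can be cross-checked off the calculator as well. The sketch below is in Python, which the original instructions do not use. The function name chi_square_upper_tail is hypothetical, and the closed-form sum it relies on is valid only when the degrees of freedom are even, as they are here (6); for odd degrees of freedom a statistics library would be needed instead.

```python
import math


def chi_square_upper_tail(x, df):
    """Probability that a chi-square variable with df degrees of freedom
    is at least x. Uses the closed form
        exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    which holds only for even, positive df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports even, positive df only")
    half = x / 2
    term = 1.0    # current term (x/2)^j / j!, starting at j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


test_statistic = 13.21
degrees_of_freedom = 7 - 1           # number of outcomes minus one

p_value = chi_square_upper_tail(test_statistic, degrees_of_freedom)
alpha = 1 - 0.98                     # significance level for 98% confidence

print(f"P-value = {p_value:.4f}")    # about 0.0398, matching the calculator
print("reject H0" if p_value < alpha else "do not reject H0")
```

Because the P-value exceeds 0.02, the sketch reaches the same decision as the calculator procedure.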