Detailed Project Instructions

Semester Project Data Set: You will either find your own data (see Finding Semester Project Data) or Dr. Gurney will give you a data set. Once you have your semester project data set, follow the instructions below to complete the semester project.

Due Date: A copy of the project is due on the last Thursday before finals week. Send the project as an email attachment to Dr. Gurney.

Project Value: The semester project is worth 120 possible points. Projects turned in on the last Friday before finals week will earn at most 90 points, and projects turned in on the Monday of finals week will earn at most 30 points. Projects turned in after the Monday of finals week will receive no points.

Presentation Mode: Projects should be written using Microsoft Word. Use 12-point Times New Roman font and one-inch margins. Paragraphs should be aligned to the left margin, with a 6-point space between paragraphs. All graphs should be created using a statistical software program such as Excel, Minitab, R, Rguroo, or SPSS. Numerical axes should be labeled with the quantity being measured and the corresponding units, if any.

Project Identification: Make sure to include your name, your class section or meeting time, and the date (month/day/year) in the first four lines of the project.

Data Identification: In the first paragraph of your project, describe the source of your project data. If the data came from a website, give the date when the data was obtained from the website. If the data came from a periodical article or book, give the author, article title, journal or book title, and publication date of the article or book. If the data came from a survey you undertook, give a short description of how your survey participants were selected and give the date or dates when the survey answers were collected.

Qualitative Data Analysis: For the qualitative data, create a frequency distribution with frequencies and relative frequencies, and then construct a qualitative bar chart of the results. Once you have constructed the qualitative bar chart, write a paragraph that answers the following questions:

• How many classes (bars) are there?
• Which bar has the highest frequency?
• Which bar has the lowest frequency?
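
The frequency distribution described above can be sketched in Python as a check on your software's output. This is only an illustration: the handout does not name Python, the `responses` list is hypothetical data, and the bar chart itself is not drawn here.

```python
from collections import Counter

# Hypothetical qualitative responses; replace with your own data.
responses = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

counts = Counter(responses)      # frequency of each class
total = sum(counts.values())     # total number of observations

# Relative frequency = class frequency / total number of observations
freq_table = {
    category: (freq, freq / total) for category, freq in counts.items()
}

for category, (freq, rel) in sorted(freq_table.items()):
    print(f"{category:<8}{freq:>5}{rel:>8.3f}")

# Answers to the three questions about the bar chart
highest = max(counts, key=counts.get)
lowest = min(counts, key=counts.get)
print("Classes:", len(counts), "| highest:", highest, "| lowest:", lowest)
```

If two classes tie for the highest or lowest frequency, `max` and `min` report only one of them, so mention ties explicitly in your paragraph.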

Next, perform a chi-square goodness-of-fit test at 95% confidence (a 5% significance level) to determine whether your qualitative outcomes are equally likely. Write a paragraph summarizing your results, including the test statistic, the P-value, and your conclusion in plain English.
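
As a rough illustration of the goodness-of-fit test, here is a minimal sketch that assumes Python with SciPy is available. The observed counts are hypothetical, and the handout does not name Python, so treat this as a way to check the calculation rather than as the required tool.

```python
from scipy.stats import chisquare

# Hypothetical observed frequencies for four categories; replace with your own.
observed = [18, 22, 25, 15]

# With no expected frequencies supplied, chisquare tests against equal
# expected frequencies, i.e. H0: all outcomes are equally likely.
result = chisquare(observed)

alpha = 0.05  # 95% confidence corresponds to a 5% significance level
print(f"chi-square = {result.statistic:.3f}, P-value = {result.pvalue:.4f}")
if result.pvalue < alpha:
    print("Reject H0: the outcomes do not appear to be equally likely.")
else:
    print("Fail to reject H0: no significant evidence against equal likelihood.")
```

The test has (number of categories - 1) degrees of freedom, which is worth stating alongside the test statistic in your summary paragraph.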

Quantitative Data Analysis: For each quantitative variable, create a histogram with four to six classes. Put at most two histograms on one page; the histograms can be made smaller so that two fit on the same page. Create side-by-side horizontal box plots for the two comparable quantitative variables, and a single horizontal box plot for the third, non-comparable quantitative variable. All three box plots may be put on one page.

A written analysis should accompany the bar chart, each histogram, each box plot, and the scatter plot.

For each of the three histograms, write a paragraph that answers each of the following questions:

• What are the biggest and smallest data values?
• Is it symmetric, skewed to the left, or skewed to the right?
• How many peaks does it have, and where are they located?
• Does it have any gaps and, if so, where are they?
• Does it have any extreme values and, if so, what are they?

For the single box plot, you should say whether the box plot is symmetric, skewed to the right, or skewed to the left.

For the two box plots drawn over the same number line, write a paragraph that answers each of the following questions:

• Which box plot has the lowest value?
• Which box plot has the highest value?
• Which box plot has the largest range?
• Which box plot has the largest interquartile range?
• Which box plot is more skewed?
• Which box plot has the largest median?

For the two comparable quantitative variables, use the dependent difference (paired) test at 95% confidence to determine whether their means are equal. Write the summary of this test in paragraph form, including the test statistic, the P-value, and your conclusion in plain English.
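
The sketch below shows one way to run a dependent (paired) test of means, assuming Python with SciPy is available and assuming the "dependent difference test" refers to the paired t test. The `before` and `after` lists are hypothetical paired measurements.

```python
from scipy.stats import ttest_rel

# Hypothetical paired measurements: each position is the same subject.
before = [72, 68, 75, 80, 66, 71, 77, 69]
after = [70, 65, 76, 74, 64, 70, 72, 68]

# Paired t test of H0: the mean of the differences is zero
result = ttest_rel(before, after)

alpha = 0.05  # 95% confidence corresponds to a 5% significance level
print(f"t = {result.statistic:.3f}, P-value = {result.pvalue:.4f}")
if result.pvalue < alpha:
    print("Reject H0: the two means appear to differ.")
else:
    print("Fail to reject H0: no significant difference between the means.")
```

The two lists must have the same length and the same ordering of subjects, because the test works on the pairwise differences.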

Create a table like the following in your project, and use your three quantitative variables to fill in the values. The headings "Variable 1", "Variable 2", and "Variable 3" should be replaced with the names of your three quantitative variables.

Statistic                     Variable 1   Variable 2   Variable 3
Minimum
First Quartile
Median
Third Quartile
Maximum
Mean
Mode
Midrange
Range
Interquartile Range
Standard Deviation
Standard Error of the Mean
Skewness
Kurtosis
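
Most rows of the table can be cross-checked with a short sketch like the one below, which uses only Python's standard library. The `summarize` helper and the `sample` data are hypothetical. Quartile conventions differ between programs, so small differences from Excel or Minitab are possible. Skewness and kurtosis are left out because their definitions also vary by software; take those values from the program you use.

```python
import math
import statistics as st


def summarize(values):
    """Return summary statistics for one quantitative variable."""
    # "inclusive" is one common quartile convention; other programs may differ.
    q1, median, q3 = st.quantiles(values, n=4, method="inclusive")
    sd = st.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "Minimum": min(values),
        "First Quartile": q1,
        "Median": median,
        "Third Quartile": q3,
        "Maximum": max(values),
        "Mean": st.mean(values),
        "Mode": st.multimode(values),  # a list, since several modes are possible
        "Midrange": (min(values) + max(values)) / 2,
        "Range": max(values) - min(values),
        "Interquartile Range": q3 - q1,
        "Standard Deviation": sd,
        "Standard Error of the Mean": sd / math.sqrt(len(values)),
    }


# Hypothetical data; replace with one of your quantitative variables.
sample = [2, 4, 4, 5, 7, 8, 9, 10, 12]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name:<28}{value}")
```

Calling `summarize` once per variable gives the three columns of the table.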

For each of your three quantitative variables, find and state the 95% confidence interval for the population mean.
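
A 95% confidence interval for a population mean can be sketched as follows, assuming Python with SciPy is available and assuming the t-based interval (population standard deviation unknown) is the intended method. The `mean_confidence_interval` helper and the `sample` data are hypothetical.

```python
import math
import statistics as st

from scipy.stats import t


def mean_confidence_interval(values, confidence=0.95):
    """t-based confidence interval for the population mean."""
    n = len(values)
    mean = st.mean(values)
    sem = st.stdev(values) / math.sqrt(n)           # standard error of the mean
    t_crit = t.ppf((1 + confidence) / 2, df=n - 1)  # two-sided critical value
    margin = t_crit * sem                           # margin of error
    return mean - margin, mean + margin


# Hypothetical data; replace with one of your quantitative variables.
sample = [2, 4, 4, 5, 7, 8, 9, 10, 12]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The interval is the sample mean plus or minus the critical value times the standard error of the mean, so it uses the same standard error that appears in the summary table.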

For each of the three quantitative variables, use the Ryan-Joiner or Shapiro-Wilk test at 95% confidence to see whether or not the variable is normally distributed. State the test statistics and the P-values or critical values for each. Then state your conclusions on whether or not each of the variables is normally distributed.
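
The Shapiro-Wilk option can be illustrated with the minimal sketch below, assuming Python with SciPy is available. The `sample` values are hypothetical, and the Ryan-Joiner test is not shown.

```python
from scipy.stats import shapiro

# Hypothetical data; replace with one of your quantitative variables.
sample = [4.1, 5.3, 4.8, 5.9, 5.1, 4.6, 5.5, 5.0, 4.9, 5.2]

# H0: the data come from a normally distributed population
result = shapiro(sample)

alpha = 0.05  # 95% confidence corresponds to a 5% significance level
print(f"W = {result.statistic:.4f}, P-value = {result.pvalue:.4f}")
if result.pvalue < alpha:
    print("Reject H0: the variable does not appear to be normally distributed.")
else:
    print("Fail to reject H0: no significant evidence against normality.")
```

Failing to reject the null hypothesis does not prove normality; it only means the data are consistent with a normal distribution at this significance level.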

For one of the two comparable quantitative variables and the non-comparable quantitative variable, do the following:

• Create a scatter diagram which includes the regression line. Make the scatter diagram as big as the space will allow. If you need another page for it, use another page.
• State the equation of the regression line below the scatter diagram.
• Find and interpret the coefficient of determination in terms of the scatter diagram you created.
• Identify any outliers or influential observations in the scatter diagram you created.
• Test whether there is a significant linear correlation between the two variables at 95% confidence. In the written summary of this test, include the test statistic or the correlation coefficient, the P-value, and your conclusion in plain English.
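
The numerical parts of the regression steps above can be sketched as follows, assuming Python with SciPy is available. The `x` and `y` lists are hypothetical, and the scatter diagram itself is not drawn here.

```python
from scipy.stats import linregress

# Hypothetical paired data; replace with your two quantitative variables.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

fit = linregress(x, y)        # least-squares regression line
r_squared = fit.rvalue ** 2   # coefficient of determination

print(f"Regression line: y = {fit.intercept:.3f} + {fit.slope:.3f}x")
print(f"r = {fit.rvalue:.4f}, r^2 = {r_squared:.4f}")

# fit.pvalue is for the two-sided test of H0: slope = 0, which is
# equivalent to testing H0: no linear correlation in simple regression.
alpha = 0.05
print(f"P-value = {fit.pvalue:.6f}")
if fit.pvalue < alpha:
    print("Reject H0: there is a significant linear correlation.")
else:
    print("Fail to reject H0: no significant linear correlation.")
```

The coefficient of determination is the proportion of the variation in `y` that is explained by the regression line, which is how it should be interpreted against the scatter diagram.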