Here are problems that are similar to the ones you might see on the exam. Be sure to review old quiz and workshop questions as well. The exam will have both multiple-choice and short-answer questions.

Data Basics

Know all the terminology: Populations, Samples, Individuals, Variables, Statistics, Parameters, etc.

Make sure you know the difference between random error and bias. What is the best way to avoid bias? What is the best way to minimize random error? Know the difference between explanatory variables, response variables, and lurking variables. Also, make sure that you understand why randomized controlled experiments let you establish cause and effect but observational studies do not.
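
To see the difference between random error and bias in action, here is a small simulation sketch. Everything in it is made up for illustration: the population (the numbers 1 through 1000), the sample sizes, and the "top half only" sampling scheme, which stands in for any method that systematically favors some individuals.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: the values 1 through 1000 (true mean 500.5).
population = list(range(1, 1001))
true_mean = mean(population)

def sample_means(pool, n, repeats=500):
    """Draw `repeats` simple random samples of size n from pool; return their means."""
    return [mean(random.sample(pool, n)) for _ in range(repeats)]

# Random error: sample means scatter around the true mean,
# and the scatter shrinks as the sample size grows.
spread_small = stdev(sample_means(population, 10))
spread_large = stdev(sample_means(population, 100))

# Bias: sampling only from the top half of the population (an assumed,
# deliberately bad scheme) misses the true mean no matter how large n is.
top_half = population[500:]
biased_center = mean(sample_means(top_half, 100))

print("spread of sample means, n = 10: ", round(spread_small, 1))
print("spread of sample means, n = 100:", round(spread_large, 1))
print("true mean:", true_mean, " center of biased samples:", round(biased_center, 1))
```

The random samples are centered on the true mean and get less variable as n increases. The biased samples stay far from the true mean even with n = 100, because a larger sample does nothing to fix a bad sampling method.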

Make sure you know all of the different ways to plot a quantitative variable: Boxplots, Stemplots, Histograms. There will also be questions about how we measure the shape, center, and spread of a quantitative variable.
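
As a quick refresher on center and spread, the sketch below computes a five-number summary, the IQR, the mean, and the standard deviation. The data are made up, and the quartiles use one common textbook convention (the median of each half, leaving out the overall median when the count is odd). Software and other textbooks may use slightly different quartile rules, so check which one your course uses.

```python
from statistics import mean, median, stdev

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [2, 4, 4, 5, 7, 9, 10, 12, 15]  # made-up data

minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1  # spread of the middle half of the data

print("five-number summary:", minimum, q1, med, q3, maximum)
print("IQR:", iqr)
print("mean:", round(mean(scores), 2), " standard deviation:", round(stdev(scores), 2))
```

The median and IQR are the numbers a boxplot displays, and they are less affected by outliers and skew than the mean and standard deviation.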

In the following exercises, the slope of the regression line is the estimate in the row of the regression table that corresponds to the explanatory variable. The y-intercept is the estimate in the (Intercept) row. For example, in exercise 8.26, the slope is 4.034 and the y-intercept is -0.357.
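
Once you have read the slope and y-intercept off the table, making a prediction is just plugging into the line. The sketch below uses the two estimates quoted for exercise 8.26. The x values are arbitrary illustrations, not data from the exercise, and the function name predict is just a label chosen here.

```python
# Estimates quoted above for exercise 8.26.
INTERCEPT = -0.357
SLOPE = 4.034

def predict(x):
    """Predicted response on the regression line: y-hat = intercept + slope * x."""
    return INTERCEPT + SLOPE * x

# The x values below are arbitrary, chosen only to illustrate the arithmetic.
at_zero = predict(0)                       # equals the y-intercept
at_one = predict(1)                        # intercept plus one slope
change_per_unit = predict(3) - predict(2)  # equals the slope

print("prediction at x = 0:", at_zero)
print("prediction at x = 1:", round(at_one, 3))
print("change in prediction per unit of x:", round(change_per_unit, 3))
```

This shows how to interpret the two numbers: the y-intercept is the predicted response when the explanatory variable is 0, and the slope is how much the predicted response changes when the explanatory variable goes up by one unit.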

Make sure you know how to make a stacked bar graph showing percentages, and how to find a relative risk from a contingency table. Be sure you know what it means for two categorical variables to be associated/independent.
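
Here is a sketch of the two calculations for a two-by-two contingency table. The counts, group names, and outcome names are all hypothetical. The row percentages are the heights of the segments in a stacked bar graph with one bar per group, and the relative risk is the ratio of the two groups' risks.

```python
# Hypothetical 2x2 table of counts: rows are groups, columns are outcomes.
table = {
    "treatment": {"event": 10, "no event": 90},
    "control": {"event": 20, "no event": 80},
}

def row_percentages(counts):
    """Percent of a group's individuals in each outcome (one stacked bar)."""
    total = sum(counts.values())
    return {outcome: 100 * count / total for outcome, count in counts.items()}

def risk(counts, outcome="event"):
    """Proportion of a group's individuals with the given outcome."""
    return counts[outcome] / sum(counts.values())

percentages = {group: row_percentages(counts) for group, counts in table.items()}
relative_risk = risk(table["treatment"]) / risk(table["control"])

for group, percents in percentages.items():
    print(group, percents)
print("relative risk (treatment vs. control):", relative_risk)
```

In this made-up table the relative risk is 0.5, so the event is half as likely in the treatment group. If the two variables were independent, every bar would be split in the same percentages and the relative risk would be 1.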