Statistical Inference Concepts

Below is a self-test quiz to see how comfortable you are with the concepts related to statistical inference. For each question, when you click on a choice it will tell you if you are right (✔) or wrong (✖) with a short explanation of why.

  1. Which of the following is not a reason why large sample sizes are better in statistics?

    A. Large samples reduce random error.✖ Large samples do reduce random error!

    B. Sample means and proportions from large samples are more likely to follow a normal distribution.✖ No. This is the central limit theorem.

    C. Large samples reduce sample bias.✔ Yes, this statement is false. Just because a sample is large doesn’t mean it is not biased.

    D. The inference formulas are more robust with large samples.✖ The inference formulas assume that the sample means or proportions are normally distributed. That assumption is less important when the sample sizes are large because of the central limit theorem.


  1. Why are low p-values significant?

    A. The p-value is the probability that the null hypothesis is true. If it is under 5%, then the null hypothesis is probably false.✖ This is a very common misconception of what a p-value is.

    B. A low p-value means that our results are unlikely if the null hypothesis is true. That suggests that maybe the null hypothesis is wrong.✔ Correct. Make sure to memorize the definition of a p-value since it is not intuitive!

    C. A low p-value means that there probably wasn’t any bias in the sample, so you can reject the null hypothesis.✖ A p-value is a mathematical calculation. Math can’t fix bias!


  1. If your p-value is over 5%, what should you conclude?

    A. The p-value is large, so we should reject the null hypothesis.✖ This is backwards! Large p-values are not significant.

    B. The p-value is large, so the null hypothesis is probably true.✖ Never conclude that the null hypothesis is true.

    C. The p-value is large, so our results are inconclusive.✔ Correct. Another way to say this is that our results might be due to random chance.


  1. A researcher is studying the age when men and women get married for the first time. They find that men tend to be a little older than women when they first get married. The researcher makes a 2-sample t-distribution confidence interval to compare men versus women. The 95% confidence interval ends up being from 1.3 to 3.1 years. Which of the following is the best interpretation of this interval?

    A. We are 95% sure that the average age when men get married the first time is between 1.3 and 3.1 years older than the average for women.✔ Correct. The 2-sample confidence interval is supposed to find the difference between the two population means.

    B. 95% of men are between 1.3 and 3.1 years older than their wives.✖ The 2-sample confidence interval is supposed to find the difference in population means, not 95% of the men.

    C. 95% of men are between 1.3 and 3.1 years older than 95% of wives.✖ The 2-sample confidence interval is supposed to find the difference in population means, not 95% of the individuals.

    D. There is only a 5% chance that there is no difference between men and women.✖ Definitely not correct. It would have been correct if it had said that there is only a 5% chance that the difference between the average ages for men and women is not between 1.3 and 3.1 years.


  1. Suppose that in a sample of 30 married men, the average age when they first married was 26.7 years with a standard deviation of 5.1 years. What is the margin of error for a 95% confidence interval?

    A. 1.904.✔ Correct. It looks like you used the right t-value for 29 degrees of freedom.

    B. 1.901.✖ This is really close, but did you use 30 degrees of freedom instead of 29?

    C. 1.825.✖ That’s close, but you should use a critical t-value instead of z*=1.96.

    D. 13.7.✖ Not close. Did you use the mean instead of the standard deviation in the formula?
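The arithmetic behind choices A, B, and C can be checked with a short sketch. It assumes the usual one-sample formula, margin of error = critical value × s / √n, and takes the critical values from a standard t-table rounded to three decimals; the constant and function names are illustrative, not part of the quiz.

```python
import math

# Values given in the question.
n = 30    # sample size
s = 5.1   # sample standard deviation, in years

# Critical values for 95% confidence, copied from standard tables
# (an assumption of this sketch, rounded to three decimals).
T_STAR_DF29 = 2.045  # t* with n - 1 = 29 degrees of freedom (choice A)
T_STAR_DF30 = 2.042  # t* with 30 degrees of freedom (the slip in choice B)
Z_STAR = 1.960       # z* from the normal distribution (the slip in choice C)


def margin_of_error(critical_value, sd, sample_size):
    """Margin of error for a one-sample mean: critical value * s / sqrt(n)."""
    return critical_value * sd / math.sqrt(sample_size)


me_correct = margin_of_error(T_STAR_DF29, s, n)
me_wrong_df = margin_of_error(T_STAR_DF30, s, n)
me_z = margin_of_error(Z_STAR, s, n)

print(round(me_correct, 3))   # 1.904
print(round(me_wrong_df, 3))  # 1.901
print(round(me_z, 3))         # 1.825
```

The three results differ only in the critical value, which is why the wrong choices land so close to the right one.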


  1. A 2005 study looked at whether swimming with dolphins for two weeks in Honduras was more beneficial for patients with clinical depression than swimming a similar amount in Honduras, but without dolphins. A group of 30 volunteers with clinical depression were randomly assigned to either swim with dolphins, or to swim without. After two weeks, they were tested to see if their symptoms of depression had improved. Here are the results from the study, in a two-way table.

    |                     | Dolphin Therapy | Control Group |
    |---------------------|-----------------|---------------|
    | Depression Improved | 10              | 3             |
    | Didn’t Improve      | 5               | 12            |


    What would be the best way to determine if swimming with dolphins leads to a significant improvement in depression symptoms?

    A. Do a matched pairs t-test.✖ No, the people in these two groups are not matched, and the data is categorical, so a t-test doesn’t make sense.

    B. Do a two-sample t-test.✖ The data here is categorical, not quantitative. The numbers are counts of successes and failures, so you should do a hypothesis test for proportions.

    C. Do a two-sample test for proportions.✔ That’s correct. The response variable is categorical so it can be converted into percents, you have two samples, and you want to answer a yes/no question.

    Note: A χ²-test for association would also work here.

    D. None of the above.✖ No, there is definitely one answer above that is correct.
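As a rough sketch, the two-sample test for proportions in choice C could be carried out on the table's counts as follows. It assumes a one-sided alternative (dolphin therapy improves more than the control), a pooled standard error, and a normal approximation, which is only approximate with counts this small. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the two-way table in the question.
improved_dolphin, n_dolphin = 10, 15
improved_control, n_control = 3, 15

# Sample proportions of patients who improved in each group.
p_dolphin = improved_dolphin / n_dolphin
p_control = improved_control / n_control

# Pooled proportion under the null hypothesis of no difference.
p_pooled = (improved_dolphin + improved_control) / (n_dolphin + n_control)

# Standard error of the difference in proportions, using the pooled estimate.
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_dolphin + 1 / n_control))

# Test statistic and one-sided p-value from the standard normal distribution.
z = (p_dolphin - p_control) / se
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

Under these assumptions the z-statistic comes out near 2.58 and the p-value is below 1%, so the difference would count as significant at the 5% level.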


  1. Which of the following best explains why we use a t-distribution in statistics?

    A. A t-distribution helps compensate for the extra random error that comes from not knowing the population standard deviation.✔ Yes, that’s why you use t-distributions when you only know the sample standard deviation s but not the true standard deviation σ.

    B. A t-distribution helps compensate for bias in a sample.✖ No, math can’t fix bias, not even the t-distribution!

    C. t-distributions are better when you have two-samples.✖ Whether you use a t-distribution technique has nothing to do with whether you have one or two samples.

    D. We use t-distributions because they have degrees of freedom.✖ No, it’s the other way around actually. We need t-distributions, and they just happen to have degrees of freedom.


  1. A 95% confidence interval for the mean reading achievement score for a population of third-grade students is from 44.2 to 54.2 points. Suppose you compute a 99% confidence interval using the same information. Which of the following statements is correct?

    A. The intervals have the same width.✖ No, something will need to change to increase the confidence level.

    B. The 99% interval is shorter.✖ Think of the confidence interval as a net that is supposed to catch the parameter of interest. If you make the net smaller, it will be less likely to catch the parameter.

    C. The 99% interval is longer.✔ Correct, by making the interval wider, it is more likely to contain the mean reading achievement score for the population.

    D. The answer can’t be determined from the information given.✖ No, there is enough information since the only thing that is changing is the confidence level, not the data.
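To see how much wider the 99% interval might be, here is a minimal sketch that rescales the margin of error from the last question. It assumes the interval was built with normal (z) critical values, which the question does not state; with a t-distribution the exact numbers would differ a little, but the 99% interval would still be the longer one.

```python
from statistics import NormalDist

# The 95% confidence interval given in the question.
lower_95, upper_95 = 44.2, 54.2

center = (lower_95 + upper_95) / 2     # point estimate of the mean
margin_95 = (upper_95 - lower_95) / 2  # margin of error at 95%

# Two-sided critical values from the standard normal distribution
# (assumption: a z-interval; the question does not give the sample size).
z_95 = NormalDist().inv_cdf(0.975)
z_99 = NormalDist().inv_cdf(0.995)

# The standard error does not change, so only the critical value scales.
standard_error = margin_95 / z_95
margin_99 = z_99 * standard_error

lower_99 = center - margin_99
upper_99 = center + margin_99

print(f"95% margin: {margin_95:.2f}, 99% margin: {margin_99:.2f}")
print(f"99% interval: {lower_99:.1f} to {upper_99:.1f}")
```

The data and the standard error stay fixed, so raising the confidence level can only increase the critical value and therefore the width.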