Chapter 1 Problems

1.23 Medical students. Students who have finished medical school are assigned to residencies in hospitals to receive further training in a medical specialty. Here is part of a hypothetical database of students seeking residency positions. USMLE is the student's score on Step 1 of the national medical licensing examination.

Name             Medical school  Sex  Age  USMLE  Specialty sought
Abrams, Laurie   Florida         F    28   238    Family medicine
Brown, Gordon    Meharry         M    25   205    Radiology
Cabrera, Maria   Tufts           F    26   191    Pediatrics
Ismael, Miranda  Indiana         F    32   245    Internal medicine

  1. What individuals does this data set describe?
  2. In addition to the student's name, how many variables does the data set contain? Which of these variables are categorical and which are quantitative?
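
One way to approach parts 1 and 2 is to store each student as a record and look at what kind of value each field holds. The Python sketch below is only an illustration: the key names are hypothetical shorthand for the column headings, and the rule "numeric values are quantitative" is a simplification that works for this table but not for every data set (a numeric ZIP code, for example, would still be categorical).

```python
# Each dictionary is one individual (a student); each key is one variable.
# The key names are hypothetical shorthand for the table's column headings.
students = [
    {"name": "Abrams, Laurie", "school": "Florida", "sex": "F",
     "age": 28, "usmle": 238, "specialty": "Family medicine"},
    {"name": "Brown, Gordon", "school": "Meharry", "sex": "M",
     "age": 25, "usmle": 205, "specialty": "Radiology"},
    {"name": "Cabrera, Maria", "school": "Tufts", "sex": "F",
     "age": 26, "usmle": 191, "specialty": "Pediatrics"},
    {"name": "Ismael, Miranda", "school": "Indiana", "sex": "F",
     "age": 32, "usmle": 245, "specialty": "Internal medicine"},
]


def classify(records, label="name"):
    """Classify every variable except the identifying label.

    Simplifying assumption: a variable is quantitative when all of its
    values are numbers, and categorical otherwise.
    """
    kinds = {}
    for key in records[0]:
        if key == label:
            continue
        numeric = all(isinstance(r[key], (int, float)) for r in records)
        kinds[key] = "quantitative" if numeric else "categorical"
    return kinds


variable_kinds = classify(students)
for variable, kind in variable_kinds.items():
    print(f"{variable}: {kind}")
```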

1.26 Facebook and MySpace audience. Although most social-networking Web sites in the United States have fairly short histories, the growth of these sites has been exponential. By far, the two most visited social-networking sites are Facebook.com and MySpace.com. Here is the age distribution of the audience for the two sites in December 2009. FACEBOOK

  1. Draw a bar graph for the age distribution of Facebook visitors. Do the same for MySpace, using the same scale for the percent axis.
  2. Describe the most important difference in the age distribution of the audience for Facebook and MySpace. How does this difference show up in the bar graphs? Do you think it was important to order the bars by age to make the comparison easier?
  3. Explain why it is appropriate to use a pie chart to display either of these distributions. Draw a pie chart for each distribution. Do you think it is easier to compare the two distributions with bar graphs or pie charts? Explain your reasoning.

1.35 Where are the nurses? The following spreadsheet gives the number of active nurses per 100,000 people in each state. NURSES

  1. Why is the number of nurses per 100,000 people a better measure of the availability of nurses than a simple count of the number of nurses in a state?
  2. Make a histogram that displays the distribution of nurses per 100,000 people. Write a brief description of the distribution. Are there any outliers? If so, can you explain them?

Chapter 2 Problems

2.25 Incomes of college grads. According to the Census Bureau's 2010 Current Population Survey, the mean and median 2009 income of people at least 25 years old who had a bachelor's degree but no higher degree were $46,931 and $58,762. Which of these numbers is the mean and which is the median? Explain your reasoning.

2.26 Saving for retirement. Retirement seems a long way off and we need money now, so saving for retirement is hard. Once every three years, the Board of Governors of the Federal Reserve System collects data on household assets and liabilities through the Survey of Consumer Finances (SCF). The most recent such survey was conducted in 2007, and the survey results were released to the public in April 2009. The survey presents data on household ownership of, and balances in, retirement savings accounts. Only 53.6% of households own retirement savings accounts. The mean value per household is $148,579, but the median value is just $45,000. For households in which the head of the household is under 35, 42.6% own retirement accounts, the mean is $25,279, and the median is $9,600. What explains the differences between the two measures of center, both for all households and for the under-35 age group?

2.21 The respiratory system can be a limiting factor in maximal exercise performance. Research from the United Kingdom studied the effect of two breathing frequencies on both performance times and several physiological parameters in swimming. Subjects were 10 male collegiate swimmers. Here are their times in seconds to swim 200 meters at 90% of race pace when breathing every second stroke in front-crawl swimming. SWIMTIMES

  151.6  165.1  159.2  163.5  174.8 
  173.2  177.6  174.3  164.1  171.4 

Using a calculator or computer, the standard deviation is about

(a) 7.4.    (b) 7.8.    (c) 8.2.
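
For readers taking the "computer" route, here is a minimal Python sketch using only the standard library. It assumes the intended quantity is the sample standard deviation \(s\), which divides by \(n - 1\). The version that divides by \(n\) is printed as well, because the two differ noticeably for a sample of only 10 times and some calculators report both.

```python
import statistics

# Times in seconds from Exercise 2.21 (10 swimmers).
swim_times = [151.6, 165.1, 159.2, 163.5, 174.8,
              173.2, 177.6, 174.3, 164.1, 171.4]

mean_time = statistics.mean(swim_times)

# Sample standard deviation: sum of squared deviations divided by n - 1.
sample_sd = statistics.stdev(swim_times)

# Population standard deviation: divides by n instead, so it is smaller.
population_sd = statistics.pstdev(swim_times)

print(f"mean              = {mean_time:.2f} s")
print(f"s  (divisor n-1)  = {sample_sd:.2f} s")
print(f"sd (divisor n)    = {population_sd:.2f} s")
```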

2.23 The correct units for the standard deviation in Exercise 2.21 are

(a) no units--it's just a number.
(b) seconds. 
(c) seconds squared.

2.32 Weight of newborns. The table below gives the distribution of the weight at birth for all babies born in the United States in 2008:

Weight (grams)        Count
Less than 500         6,581
500 to 999           23,292
1,000 to 1,499       31,900
1,500 to 1,999       67,140
2,000 to 2,499      218,196
2,500 to 2,999      788,148
3,000 to 3,499    1,663,512
3,500 to 3,999    1,120,642
4,000 to 4,499      280,270
4,500 to 4,999       39,109
5,000 to 5,499        4,443

  1. For comparison with other years and with other countries, we prefer a histogram of the percents in each weight class rather than the counts. Explain why.
  2. How many babies are there?
  3. Make a histogram of the distribution, using percents on the vertical scale.
  4. What are the locations of the median and quartiles in the ordered list of all birth weights? In which weight classes do the median and quartiles fall?
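
The bookkeeping behind parts 2 through 4 can be sketched in Python as follows. This assumes the usual textbook convention that the median sits at position \((n + 1)/2\) in the ordered list and that each quartile is the median of the observations on one side of that position. The helper name `class_containing` is introduced here for illustration only.

```python
# Counts by weight class (grams) from the table in Exercise 2.32.
weight_classes = [
    ("Less than 500", 6581),
    ("500 to 999", 23292),
    ("1,000 to 1,499", 31900),
    ("1,500 to 1,999", 67140),
    ("2,000 to 2,499", 218196),
    ("2,500 to 2,999", 788148),
    ("3,000 to 3,499", 1663512),
    ("3,500 to 3,999", 1120642),
    ("4,000 to 4,499", 280270),
    ("4,500 to 4,999", 39109),
    ("5,000 to 5,499", 4443),
]

total = sum(count for _, count in weight_classes)

# Percent of all births in each class (the heights of the histogram bars).
percents = {label: 100 * count / total for label, count in weight_classes}


def class_containing(position):
    """Return the weight class that holds a position in the ordered list."""
    cumulative = 0
    for label, count in weight_classes:
        cumulative += count
        if position <= cumulative:
            return label
    raise ValueError("position exceeds the number of observations")


# Median location, then quartile locations as medians of each half.
median_pos = (total + 1) / 2
half = total // 2                  # observations on each side of the median
q1_pos = (half + 1) / 2
q3_pos = (total - half) + q1_pos   # same offset, counted within the upper half

print(f"total births: {total:,}")
for name, pos in [("Q1", q1_pos), ("median", median_pos), ("Q3", q3_pos)]:
    print(f"{name}: position {pos:,.1f}, class {class_containing(pos)}")
```

A location ending in .5 means the value is the average of the two neighbouring observations; with classes this large both neighbours fall in the same class, so the lookup is unaffected.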

2.44 Athletes' salaries. The Montreal Canadiens were founded in 1909 and are the longest continuously operating professional ice hockey team. They have won 24 Stanley Cups, making them one of the most successful professional sports teams of the traditional four major sports of Canada and the United States. The linked spreadsheet gives the salaries of the 2010-2011 roster. Provide the team owner with a full description of the distribution of salaries and a brief summary of its most important features. HOCKEYSALARIES

2.50 Older Americans. The attached spreadsheet contains the percentage of Americans in each state who are aged 65 and older. The same data is shown in a stem-and-leaf plot below. OVER65

  The decimal point is at the |

     7 | 0
     8 | 8
     9 | 
    10 | 0139
    11 | 378888
    12 | 1114456689999
    13 | 012222334556688
    14 | 0011367
    15 | 035
    16 | 9
  1. Give the five-number summary of this distribution.
  2. Use the five-number summary to draw a boxplot of the data.
  3. Which observations does the \(1.5 \times IQR\) rule flag as suspect outliers? (The rule flags several observations that are not that extreme. The reason is that the center half of the observations are close together, so that the \(IQR\) is small. This example reminds us to use our eyes, not a rule, to spot outliers.)
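
As a check on hand calculations, the stemplot can be read back into a list of values and summarized in Python. The sketch below assumes the quartile convention in which \(Q_1\) and \(Q_3\) are the medians of the observations below and above the location of the overall median. Statistical software often uses slightly different quartile rules, so its output may not match exactly.

```python
# Stemplot from Exercise 2.50: stem = whole percent, each leaf = one tenth.
stemplot = {
    7: "0", 8: "8", 9: "", 10: "0139", 11: "378888",
    12: "1114456689999", 13: "012222334556688",
    14: "0011367", 15: "035", 16: "9",
}
values = sorted(
    float(f"{stem}.{leaf}")
    for stem, leaves in stemplot.items()
    for leaf in leaves
)


def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(sorted_values):
    """Minimum, Q1, median, Q3, maximum (quartiles as medians of the halves)."""
    n = len(sorted_values)
    lower = sorted_values[: n // 2]         # observations below the median
    upper = sorted_values[(n + 1) // 2:]    # observations above the median
    return (sorted_values[0], median(lower), median(sorted_values),
            median(upper), sorted_values[-1])


minimum, q1, med, q3, maximum = five_number_summary(values)

# 1.5 x IQR rule: flag anything beyond the fences as a suspected outlier.
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
suspects = [v for v in values if v < low_fence or v > high_fence]

print("five-number summary:", minimum, q1, med, q3, maximum)
print("suspected outliers:", suspects)
```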

Chapter 3 Problems

3.27 Low IQ test scores. Scores on the Wechsler Adult Intelligence Scales (WAIS) are approximately Normal with mean 100 and standard deviation 15. People with WAIS scores below 70 are considered mentally retarded when, for example, applying for Social Security disability benefits. According to the 68-95-99.7 rule, about what percent of adults are retarded by the criterion?
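
The reasoning the rule calls for can be written out as a short Python sketch: standardize the cutoff, then split the percent lying outside two standard deviations equally between the two tails. The exact Normal calculation at the end is included only for comparison and is not part of the 68-95-99.7 rule.

```python
from statistics import NormalDist

mean, sd, cutoff = 100, 15, 70

# Standardize the cutoff: how many standard deviations is it from the mean?
z = (cutoff - mean) / sd

# 68-95-99.7 rule: about 95% of values lie within 2 standard deviations,
# so the remaining 5% is split equally between the two tails.
rule_percent_below = (100 - 95) / 2

# Exact Normal calculation, for comparison with the rule of thumb.
exact_percent_below = 100 * NormalDist(mean, sd).cdf(cutoff)

print(f"z = {z}")
print(f"68-95-99.7 rule: about {rule_percent_below}% below {cutoff}")
print(f"exact Normal:    about {exact_percent_below:.2f}% below {cutoff}")
```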

3.46 Normal is only approximate: IQ test scores. Here are the IQ test scores of 31 seventh-grade girls in a Midwest school district: MIDWESTIQ

    114  100  104   89  102   91  114  114  103  105  108
    130  120  132  111  128  118  119   86   72  111  103
     74  112  107  103   98   96  112  112   93
  1. We expect IQ scores to be approximately Normal. Make a stemplot to check that there are no major departures from Normality.
  2. Nonetheless, proportions calculated from a Normal distribution are not always very accurate for small numbers of observations. Find the mean \(\bar{x}\) and standard deviation \(s\) for the IQ scores. What proportion of the scores are within one standard deviation of the mean? Within two standard deviations of the mean? What would these proportions be in an exactly Normal distribution?
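
The calculations in part 2 can be sketched in Python with the standard library. The sketch assumes \(s\) is the sample standard deviation (divisor \(n - 1\)) and counts a score as "within" an interval when it lies between the endpoints, inclusive. The helper `proportion_within` is a name chosen here for illustration.

```python
import statistics

# IQ scores of the 31 seventh-grade girls from Exercise 3.46.
iq_scores = [
    114, 100, 104, 89, 102, 91, 114, 114, 103, 105, 108,
    130, 120, 132, 111, 128, 118, 119, 86, 72, 111, 103,
    74, 112, 107, 103, 98, 96, 112, 112, 93,
]

x_bar = statistics.mean(iq_scores)
s = statistics.stdev(iq_scores)   # sample standard deviation, divisor n - 1


def proportion_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    low, high = x_bar - k * s, x_bar + k * s
    inside = sum(low <= score <= high for score in iq_scores)
    return inside / len(iq_scores)


within_one = proportion_within(1)
within_two = proportion_within(2)

# Corresponding proportions for an exactly Normal distribution.
standard_normal = statistics.NormalDist()
normal_one = standard_normal.cdf(1) - standard_normal.cdf(-1)
normal_two = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
print(f"within 1 sd: data {within_one:.3f}, Normal {normal_one:.3f}")
print(f"within 2 sd: data {within_two:.3f}, Normal {normal_two:.3f}")
```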