Chapter 1 Problems

1.23 Medical students. Students who have finished medical school are assigned to residencies in hospitals to receive further training in a medical specialty. Here is part of a hypothetical database of students seeking residency positions. USMLE is the student's score on Step 1 of the national medical licensing examination.

Name             Medical school  Sex  Age  USMLE  Specialty sought
Abrams, Laurie   Florida         F    28   238    Family medicine
Brown, Gordon    Meharry         M    25   205    Radiology
Cabrera, Maria   Tufts           F    26   191    Pediatrics
Ismael, Miranda  Indiana         F    32   245    Internal medicine
  1. What individuals does this data set describe?
  2. In addition to the student's name, how many variables does the data set contain? Which of these variables are categorical and which are quantitative?

1.26 Facebook and MySpace audience. Although most social-networking Web sites in the United States have fairly short histories, the growth of these sites has been exponential. By far, the two most visited social-networking sites are Facebook.com and MySpace.com. Here is the age distribution of the audience for the two sites in December 2009. FACEBOOK

  1. Draw a bar graph for the age distribution of Facebook visitors. Do the same for MySpace, using the same scale for the percent axis.
  2. Describe the most important difference in the age distribution of the audience for Facebook and MySpace. How does this difference show up in the bar graphs? Do you think it was important to order the bars by age to make the comparison easier?
  3. Explain why it is appropriate to use a pie chart to display either of these distributions. Draw a pie chart for each distribution. Do you think it is easier to compare the two distributions with bar graphs or pie charts? Explain your reasoning.

1.35 Where are the nurses? The following spreadsheet gives the number of active nurses per 100,000 people in each state. NURSES

  1. Why is the number of nurses per 100,000 people a better measure of the availability of nurses than a simple count of the number of nurses in a state?
  2. Make a histogram that displays the distribution of nurses per 100,000 people. Write a brief description of the distribution. Are there any outliers? If so, can you explain them?

Chapter 2 Problems

2.25 Incomes of college grads. According to the Census Bureau's 2010 Current Population Survey, the mean and median 2009 income of people at least 25 years old who had a bachelor's degree but no higher degree were $46,931 and $58,762. Which of these numbers is the mean and which is the median? Explain your reasoning.

2.32 Weight of newborns. The table below gives the distribution of the weight at birth for all babies born in the United States in 2008:

Weight (grams)        Count
Less than 500         6,581
500 to 999           23,292
1,000 to 1,499       31,900
1,500 to 1,999       67,140
2,000 to 2,499      218,196
2,500 to 2,999      788,148
3,000 to 3,499    1,663,512
3,500 to 3,999    1,120,642
4,000 to 4,499      280,270
4,500 to 4,999       39,109
5,000 to 5,499        4,443
  1. For comparison with other years and with other countries, we prefer a histogram of the percents in each weight class rather than the counts. Explain why.
  2. How many babies are there?
  3. Make a histogram of the distribution, using percents on the vertical scale.
  4. What are the locations of the median and quartiles in the ordered list of all birth weights? In which weight classes do the median and quartiles fall?
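
One way to check the arithmetic in Exercise 2.32 is sketched below in Python. The counts are copied from the table; the names `classes` and `class_at` are illustrative only. The quartile locations assume the convention that each quartile is the median of the observations on one side of the overall median, and the position formulas assume the total count is odd, which holds for these counts.

```python
# Counts copied from the birth-weight table in Exercise 2.32.
classes = [
    ("Less than 500", 6581),
    ("500 to 999", 23292),
    ("1,000 to 1,499", 31900),
    ("1,500 to 1,999", 67140),
    ("2,000 to 2,499", 218196),
    ("2,500 to 2,999", 788148),
    ("3,000 to 3,499", 1663512),
    ("3,500 to 3,999", 1120642),
    ("4,000 to 4,499", 280270),
    ("4,500 to 4,999", 39109),
    ("5,000 to 5,499", 4443),
]

total = sum(count for _, count in classes)

# Percent of all births in each weight class (the heights of the histogram bars).
percents = [(label, 100 * count / total) for label, count in classes]


def class_at(position):
    """Return the weight class holding a 1-based position in the ordered list."""
    running = 0
    for label, count in classes:
        running += count
        if position <= running:
            return label
    raise ValueError("position is beyond the last observation")


# Locations in the ordered list (assumes an odd total, as is the case here).
median_pos = (total + 1) / 2
half = total // 2                 # observations on each side of the median
q1_pos = (half + 1) / 2           # median of the lower half
q3_pos = median_pos + q1_pos      # median of the upper half

for name, pos in [("Q1", q1_pos), ("Median", median_pos), ("Q3", q3_pos)]:
    print(f"{name}: position {pos}, class {class_at(pos)}")
```

A location ending in .5 means the value is the average of the two neighboring observations; here both neighbors fall in the same weight class, so the class is unambiguous.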

2.50 Older Americans. The attached spreadsheet contains the percentage of Americans in each state who are aged 65 and older. The same data are shown in a stem-and-leaf plot below. OVER65

  The decimal point is at the |

     7 | 0
     8 | 8
     9 | 
    10 | 0139
    11 | 378888
    12 | 1114456689999
    13 | 012222334556688
    14 | 0011367
    15 | 035
    16 | 9
  1. Give the five-number summary of this distribution.
  2. Use the five-number summary to draw a boxplot of the data.
  3. Which observations does the \(1.5 \times IQR\) rule flag as suspect outliers? (The rule flags several observations that are not that extreme. The reason is that the center half of the observations are close together, so that the \(IQR\) is small. This example reminds us to use our eyes, not a rule, to spot outliers.)
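
The calculations in Exercise 2.50 can be checked with a short Python sketch. The values are read off the stem-and-leaf plot and stored as integer tenths of a percent to avoid rounding issues. The quartile rule used here (the median of each half, leaving out the overall median when the count is odd) is an assumption; statistical software often uses slightly different rules and may report slightly different quartiles.

```python
# Stems and leaves transcribed from the plot (stem 9 has no leaves).
plot = {
    7: "0", 8: "8", 9: "", 10: "0139", 11: "378888",
    12: "1114456689999", 13: "012222334556688",
    14: "0011367", 15: "035", 16: "9",
}

# Store each value in tenths of a percent, e.g. 12 | 1 becomes 121.
values = sorted(
    stem * 10 + int(leaf) for stem, leaves in plot.items() for leaf in leaves
)


def median(sorted_vals):
    """Median of an already sorted list."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2


n = len(values)
lower = values[: n // 2]          # observations below the median's position
upper = values[(n + 1) // 2 :]    # observations above the median's position
q1, m, q3 = median(lower), median(values), median(upper)

# 1.5 x IQR rule: flag anything beyond the fences.
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
flagged = [v for v in values if v < low_fence or v > high_fence]

five_number = [v / 10 for v in (values[0], q1, m, q3, values[-1])]
print("Five-number summary:", five_number)
print("Flagged by the 1.5 x IQR rule:", [v / 10 for v in flagged])
```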

Chapter 3 Problems

3.26 Daily activity. It appears that people who are mildly obese are less active than leaner people. One study looked at the average number of minutes per day that people spend standing or walking. Among mildly obese people, minutes of activity varied according to the N(373,67) distribution. Minutes of activity for lean people had the N(526,107) distribution. Within what limits do the active minutes for about 95% of the people in each group fall?
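
As a rough check on Exercise 3.26, the 68-95-99.7 rule says that about 95% of a Normal distribution lies within two standard deviations of the mean. The sketch below assumes that N(a, b) denotes a Normal distribution with mean a and standard deviation b; the name `groups` is illustrative.

```python
# (mean, standard deviation) of daily active minutes, read from N(mean, sd).
groups = {"mildly obese": (373, 67), "lean": (526, 107)}

# 68-95-99.7 rule: about 95% of values lie within 2 standard deviations.
limits = {
    name: (mean - 2 * sd, mean + 2 * sd) for name, (mean, sd) in groups.items()
}

for name, (low, high) in limits.items():
    print(f"{name}: about 95% between {low} and {high} minutes")
```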

Miles per gallon. In its Fuel Economy Guide for model year 2010 vehicles, the Environmental Protection Agency gives data on 1101 vehicles. There are a number of high outliers, mainly hybrid gas-electric vehicles. If we ignore the vehicles identified as outliers, however, the combined city and highway gas mileage of the other 1082 vehicles is approximately Normal with mean 20.3 miles per gallon (mpg) and standard deviation 4.3 mpg. Use this information in exercises 3.35 and 3.36.

3.35 In my Chevrolet. The 2010 Chevrolet Camaro with an eight-cylinder engine and automatic transmission has a combined gas mileage of 19 mpg. What percent of all vehicles have better gas mileage than the Camaro?
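
A table of standard Normal areas is the usual tool for Exercise 3.35, but the answer can also be checked with Python's standard library. This is a sketch under the stated model (mean 20.3 mpg, standard deviation 4.3 mpg), which describes only the 1082 vehicles that remain after the outliers are set aside.

```python
from statistics import NormalDist

# Normal model for combined gas mileage, outliers excluded.
mpg = NormalDist(mu=20.3, sigma=4.3)

# Proportion of vehicles with mileage above the Camaro's 19 mpg.
share_better = 1 - mpg.cdf(19)
print(f"About {100 * share_better:.1f}% of vehicles beat 19 mpg")
```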

3.36 The bottom 10%. How low must a 2010 vehicle's gas mileage be in order to fall in the bottom 10% of all vehicles?
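
Exercise 3.36 asks for a "backward" Normal calculation: start with a proportion and find the value. A minimal sketch using Python's standard library, again assuming the Normal model with mean 20.3 mpg and standard deviation 4.3 mpg:

```python
from statistics import NormalDist

# Normal model for combined gas mileage, outliers excluded.
mpg = NormalDist(mu=20.3, sigma=4.3)

# Mileage with 10% of the distribution below it (the 10th percentile).
cutoff = mpg.inv_cdf(0.10)
print(f"Bottom 10% cutoff: about {cutoff:.1f} mpg")
```

The same value comes from finding the z-score with area 0.10 to its left in a standard Normal table and converting back with x = mean + z × standard deviation.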

3.41 Heights of women. The heights of women aged 20 to 29 follow approximately the N(64.3,2.7) distribution. Men the same age have heights distributed as N(69.9,3.1). What percent of young women are taller than the mean height of young men?
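
The calculation in Exercise 3.41 can be checked with the sketch below. It assumes that N(a, b) gives the mean and standard deviation in that order, so the mean height of young men is 69.9.

```python
from statistics import NormalDist

# Heights of women aged 20 to 29, read from N(64.3, 2.7).
women = NormalDist(mu=64.3, sigma=2.7)
mean_height_men = 69.9

# Proportion of young women taller than the mean height of young men.
share_taller = 1 - women.cdf(mean_height_men)
print(f"About {100 * share_taller:.1f}% of young women exceed {mean_height_men}")
```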

3.42 Weights aren't Normal. The heights of people of the same sex and similar age follow a Normal distribution reasonably closely. Weights, on the other hand, are not Normally distributed. The weights of women aged 20 to 29 have mean 155.9 pounds and median 144.0 pounds. The first and third quartiles are 124.1 pounds and 173.7 pounds. What can you say about the shape of the weight distribution? Why?
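
Two simple comparisons help with Exercise 3.42: the mean against the median, and the distance from the median to each quartile. The sketch below only does that arithmetic on the figures given in the exercise; interpreting the result is left to the reader. The variable names are illustrative.

```python
# Summary figures for weights (pounds) of women aged 20 to 29.
mean, median = 155.9, 144.0
q1, q3 = 124.1, 173.7

# Distance from the median to each quartile.
lower_spread = median - q1
upper_spread = q3 - median

# Two informal signals of a longer right tail.
right_skew_signals = (mean > median, upper_spread > lower_spread)

print(f"Median to Q1: {lower_spread:.1f} lb, Q3 to median: {upper_spread:.1f} lb")
print("Mean above median:", right_skew_signals[0])
print("Upper spread larger:", right_skew_signals[1])
```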