Two students took a random sample of 30 textbooks for sale in a campus bookstore in 2006.
books <- read.csv("TextPrices.csv")
hist(books$Price,col='gray',main='Textbook Prices',xlab='Price (dollars)')
The prices are clearly skewed to the right.
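Below we compare two common transformations for right-skewed data: the logarithm and the square root. Neither transformation makes the data look very normal, but the square-root transform gives the more symmetric distribution, so it is the better choice here. The summaries support this: after the square root, the mean (7.301) and median (7.421) nearly agree, while after the log the mean (3.696) falls well below the median (4.007), a sign that the log overcorrects and leaves the data skewed to the left.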
hist(log(books$Price),col='gray',main='Log-transformed data',xlab='log(dollars)')
qqnorm(log(books$Price))
qqline(log(books$Price))
summary(log(books$Price))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   1.447   2.865   4.007   3.696   4.562   5.134
hist(sqrt(books$Price),col='gray',main='Square-root transformed data',xlab='sqrt(dollars)')
qqnorm(sqrt(books$Price))
qqline(sqrt(books$Price))
summary(sqrt(books$Price))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   2.062   4.192   7.421   7.301   9.785  13.029
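The visual comparison can also be backed by a single number. The sketch below computes a moment-based sample skewness for the raw, log-transformed, and square-root-transformed prices. Base R has no built-in skewness function, so `sample_skewness` is a hypothetical helper written for this illustration, not part of the original analysis. It uses the simple divide-by-n form; other definitions differ by a small-sample correction factor. Values near 0 suggest symmetry, positive values a longer right tail, and negative values a longer left tail.

```r
# Hypothetical helper (not from the original analysis): sample skewness as
# the third central moment divided by the 1.5 power of the second central
# moment, both computed with a divide-by-n average.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]          # drop missing values, if any
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Assumes TextPrices.csv (with a Price column) is in the working directory,
# as in the code above. The guard simply skips the comparison if it is not.
if (file.exists("TextPrices.csv")) {
  books <- read.csv("TextPrices.csv")
  skew <- c(raw  = sample_skewness(books$Price),
            log  = sample_skewness(log(books$Price)),
            sqrt = sample_skewness(sqrt(books$Price)))
  print(round(skew, 3))
}
```

The transformation whose skewness is closest to 0 is the most symmetric by this measure. With only 30 observations the estimate is noisy, so it is best read alongside the histograms and normal quantile plots rather than in place of them.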