exo = read.csv('http://people.hsc.edu/faculty-staff/blins/classes/spring19/math222/examples/exoplanets.csv')
rain = read.csv('http://people.hsc.edu/faculty-staff/blins/StatsExamples/rainfall.csv')

Last time we saw that rainfall was approximately normally distributed:

library(ggplot2)
ggplot(data = rain, aes(x = total)) +
  geom_histogram(binwidth = 5, boundary = 0, color = 'black', fill = 'gray') +
  labs(title = "Annual Rainfall Total 1931 - 2011",
       x = "Total Precipitation (inches)", y = "Frequency")

On the other hand, the distances to observed exoplanets were skewed right.

ggplot(data = exo, aes(x = st_dist)) +
  geom_histogram(binwidth = 500, boundary = 0, color = 'black', fill = 'gray') +
  labs(title = "Distance to Exoplanets",
       x = "Distance (parsecs)", y = "Count")
## Warning: Removed 13 rows containing non-finite values (stat_bin).

Taking the log of the distances pulls in the long right tail:

ggplot(data = exo, aes(x = log(st_dist))) +
  geom_histogram(bins = 20, boundary = 0, color = 'black', fill = 'gray') +
  labs(title = "Distance to Exoplanets",
       x = "Log Distance (parsecs)", y = "Count")
## Warning: Removed 13 rows containing non-finite values (stat_bin).

Quantile-Quantile Plots

This is what approximately normal data looks like in a Q-Q plot: the points fall close to the reference line.

qqnorm(rain$total)
qqline(rain$total)
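
For intuition about what qqnorm is plotting, here is a minimal sketch using simulated data (the vector `x` is made up for illustration and is not the rainfall data). It rebuilds the Q-Q coordinates by hand: the sorted sample goes on the y-axis and the matching standard normal quantiles go on the x-axis.

```r
# Simulated sample, used only for illustration (not the rainfall data)
set.seed(1)
x <- rnorm(50, mean = 40, sd = 7)

# Theoretical quantiles: standard normal quantiles at evenly spaced probabilities
theoretical <- qnorm(ppoints(length(x)))

# Sample quantiles: the data sorted from smallest to largest
observed <- sort(x)

# The same coordinates that qqnorm() computes (plot.it = FALSE skips the drawing)
qq <- qqnorm(x, plot.it = FALSE)
all.equal(sort(qq$x), theoretical)
all.equal(sort(qq$y), observed)
```

Calling plot(theoretical, observed) should then draw the same picture as qqnorm(x).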

And this is what extremely right-skewed data looks like: the points bend upward, away from the line.

qqnorm(exo$st_dist)
qqline(exo$st_dist)

And here is an example of moderately left-skewed data: the points bend downward, falling below the line at both ends.

qqnorm(log(exo$st_dist))
qqline(log(exo$st_dist))

The closer the points in a Q-Q plot are to a straight line, the closer the correlation between their x and y coordinates is to 1. You can get that correlation by using a command like:

myQQ = qqnorm(rain$total)

cor(myQQ$x,myQQ$y)
## [1] 0.9932586
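
To see how this correlation behaves, here is a sketch with simulated data. The samples `normal_sample` and `skewed_sample` and the helper `qq_cor` are made up for illustration and are not part of the class data. The normal sample should give a correlation very close to 1, and the right-skewed sample a noticeably lower one.

```r
# Simulated samples, for illustration only
set.seed(42)
normal_sample <- rnorm(200)   # normally distributed
skewed_sample <- rexp(200)    # right-skewed (exponential)

# Hypothetical helper: correlation between the Q-Q plot coordinates
qq_cor <- function(x) {
  qq <- qqnorm(x, plot.it = FALSE)  # compute coordinates without drawing
  cor(qq$x, qq$y)
}

r_normal <- qq_cor(normal_sample)
r_skewed <- qq_cor(skewed_sample)
r_normal
r_skewed
```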

Q-Q Plots with ggplot2

Here is how to make a Q-Q plot with the ggplot2 library.

ggplot(exo, aes(sample=st_dist))+stat_qq()+stat_qq_line() 
## Warning: Removed 13 rows containing non-finite values (stat_qq).
## Warning: Removed 13 rows containing non-finite values (stat_qq_line).
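
The warnings report rows where st_dist is not a finite number (most likely missing values). Here is a sketch of one way to avoid them, using a small made-up data frame `demo` as a stand-in for exo: drop the non-finite rows first, then build the plot.

```r
library(ggplot2)

# Made-up stand-in for exo: a few distances, two of them missing
demo <- data.frame(st_dist = c(12.5, 300, NA, 45.1, 1200, NA, 88))

# Keep only the rows with a finite distance
demo_clean <- subset(demo, is.finite(st_dist))
nrow(demo) - nrow(demo_clean)  # number of rows dropped

# Same Q-Q plot as before, now built from the cleaned data
p <- ggplot(demo_clean, aes(sample = st_dist)) + stat_qq() + stat_qq_line()
```

Typing p at the console displays the plot. The same subset() step should work on exo itself.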