The following data compares students' college GPAs and Math SAT scores with whether or not they passed Math 140 at Hampden-Sydney.
calc = read.csv("calculusResults.csv")
head(calc)
## mathSAT collegeGPA passed
## 1 630 3.261 1
## 2 675 3.932 1
## 3 620 3.597 1
## 4 595 3.540 1
## 5 630 3.289 1
## 6 515 3.187 1
In this case, the two explanatory variables (SAT score and GPA) are both quantitative, but the response variable is a binary categorical variable: Pass/Fail. Below is an example of a simple logistic regression model that uses only one explanatory variable: collegeGPA.
successModel = glm(passed~collegeGPA,data=calc,family="binomial")
successModel
##
## Call: glm(formula = passed ~ collegeGPA, family = "binomial", data = calc)
##
## Coefficients:
## (Intercept) collegeGPA
## -4.933 2.228
##
## Degrees of Freedom: 149 Total (i.e. Null); 148 Residual
## Null Deviance: 160.6
## Residual Deviance: 135.9 AIC: 139.9
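The coefficients are on the log-odds scale: the fitted model estimates the log-odds of passing as -4.933 + 2.228 × collegeGPA. As a rough sketch of how to read them, the lines below work from the rounded values printed above rather than the full-precision fit. The names b0, b1, oddsRatio, and gpaEven are illustrative and do not appear in the original analysis.

```r
# Rounded coefficients copied from the printed model output (assumed values,
# not the full-precision coef(successModel))
b0 = -4.933   # intercept
b1 = 2.228    # slope for collegeGPA

# Multiplicative change in the odds of passing per one-point increase in GPA
oddsRatio = exp(b1)

# GPA at which the fitted log-odds are 0, i.e. the probability of passing is 0.5
gpaEven = -b0 / b1

round(c(oddsRatio = oddsRatio, gpaEven = gpaEven), 2)
```

With these rounded coefficients, each additional GPA point multiplies the estimated odds of passing by roughly 9.3, and the model puts the 50/50 point at a GPA of about 2.21.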
Here is a plot of the data and the logistic regression model.
plot(calc$collegeGPA,calc$passed)
curve(predict(successModel,data.frame(collegeGPA=x),type='response'),add=TRUE)
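You can use this logistic regression model to predict the log-odds of a student passing Math 140. By default, predict returns values on the log-odds scale for a glm, not probabilities. For example, for a student with a 2.0 GPA: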
predict(successModel,data.frame(collegeGPA=2.0))
## 1
## -0.476925
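Log-odds are hard to read directly, so it can help to convert them to odds and to a probability. The sketch below starts from the value printed above instead of refitting the model; the names logOdds, odds, and prob are illustrative.

```r
# Log-odds for collegeGPA = 2.0, copied from the predict() output above
logOdds = -0.476925

# Odds of passing: exponentiate the log-odds
odds = exp(logOdds)

# Probability of passing: inverse logit, equivalent to odds / (1 + odds)
prob = plogis(logOdds)

round(c(odds = odds, prob = prob), 3)
```

This gives a probability of passing of roughly 0.38 for a student with a 2.0 GPA. Calling predict with type='response', as in the curve above, should return the probability directly.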