7.4 Logistic regression
With a binary outcome, logistic regression is generally more appropriate than linear (OLS) regression. Use the glm() function to estimate a generalized linear model, and set the family argument to binomial to request logistic estimation.
# create binary measure of "above average math proficiency"
dcps <- dcps %>%
  mutate(AboveAvgMath = if_else(ProfMath > mean(ProfMath), 1, 0))
Model3 <-
  glm(
    AboveAvgMath ~ ProfLang + NumTested, # specify model
    family = 'binomial',                 # logistic estimation
    data = dcps
  )
To view the coefficient estimates and evaluate hypotheses, again apply the summary() function to the model object.
summary(Model3)
##
## Call:
## glm(formula = AboveAvgMath ~ ProfLang + NumTested, family = "binomial",
## data = dcps)
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.224266   0.612526  -5.264 1.41e-07 ***
## ProfLang     0.113658   0.024119   4.712 2.45e-06 ***
## NumTested   -0.002393   0.002292  -1.044    0.296
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 144.342 on 107 degrees of freedom
## Residual deviance: 77.127 on 105 degrees of freedom
## AIC: 83.127
##
## Number of Fisher Scoring iterations: 6
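The coefficients are reported on the log-odds scale. To interpret them as odds ratios, exponentiate the estimates.
exp(coef(Model3))
## (Intercept)    ProfLang   NumTested
##  0.03978499  1.12036941  0.99760944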
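The results indicate that a one-percentage-point increase in a school’s language proficiency is expected to raise the odds of being above average in math by about 12% (odds ratio of 1.12), holding the number of students tested constant. Again, the increase is statistically significant (\(p < 0.001\)). The number of students tested, by contrast, has no statistically significant association with the outcome (\(p = 0.296\)).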
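The 12% figure follows from the odds ratio: an odds ratio of 1.12 means the odds are multiplied by 1.12 for each one-unit increase in the predictor, which is a change of \((1.12 - 1) \times 100\) percent. The sketch below reproduces that arithmetic. It is only an illustration: the vector `b` is typed in by hand from the summary output above (rounded to six decimals), and the names `b`, `odds_ratios`, and `pct_change` are not part of the original analysis.

```r
# Coefficient estimates copied by hand from the summary() output
# (rounded, so results match the reported odds ratios only approximately)
b <- c("(Intercept)" = -3.224266, ProfLang = 0.113658, NumTested = -0.002393)

# Odds ratios: exponentiate the log-odds coefficients
odds_ratios <- exp(b)

# Percent change in the odds for a one-unit increase in each predictor
pct_change <- (odds_ratios - 1) * 100

round(odds_ratios, 3)
round(pct_change, 1)
```

With the fitted model in memory, the same quantities can be computed directly from the model object by replacing `b` with `coef(Model3)`.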