Logistic Regression for Binomial Counts

Read in the data

require(Sleuth3)
## Loading required package: Sleuth3
krunnit <- case2101

Let’s plot the logits of the proportions vs. the logits of the probability of going extinct.

require(gtools)
require(ggplot2)
krunnit$pi_hat <- krunnit$Extinct/krunnit$AtRisk
qplot(log(Area), logit(pi_hat), data=krunnit)

(Note: here, the interpretation is that the total number of species is AtRisk, and out of those, the number of specise that went extinct is Extinct)

fit <- glm(  cbind(Extinct, AtRisk-Extinct) ~ log(Area), family="binomial", data=krunnit)
summary(fit)
## 
## Call:
## glm(formula = cbind(Extinct, AtRisk - Extinct) ~ log(Area), family = "binomial", 
##     data = krunnit)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.71726  -0.67722   0.09726   0.48365   1.49545  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.19620    0.11845 -10.099  < 2e-16 ***
## log(Area)   -0.29710    0.05485  -5.416 6.08e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 45.338  on 17  degrees of freedom
## Residual deviance: 12.062  on 16  degrees of freedom
## AIC: 75.394
## 
## Number of Fisher Scoring iterations: 4
fit.sq <- glm(cbind(Extinct, AtRisk - Extinct) ~ log(Area) + I(log(Area)^2), family= "binomial", data= krunnit)
summary(fit.sq)
## 
## Call:
## glm(formula = cbind(Extinct, AtRisk - Extinct) ~ log(Area) + 
##     I(log(Area)^2), family = "binomial", data = krunnit)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.70543  -0.64658   0.02695   0.49182   1.47740  
## 
## Coefficients:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    -1.21118    0.12971  -9.337  < 2e-16 ***
## log(Area)      -0.31780    0.09047  -3.513 0.000444 ***
## I(log(Area)^2)  0.00699    0.02430   0.288 0.773642    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 45.338  on 17  degrees of freedom
## Residual deviance: 11.979  on 15  degrees of freedom
## AIC: 77.311
## 
## Number of Fisher Scoring iterations: 4

(Note: if you do not enclose log(Area)^2 in I(), things don’t work. You have to either do that or add an additional column to the data frame with the value log(Area)^2)

(We haven’t covered this, but BIC is used similarly to AIC)

BIC(fit)
## [1] 77.17449
BIC(fit.sq)
## [1] 79.98255