F-tests

knitr::opts_chunk$set(echo = TRUE)
library(Sleuth3)
library(ggplot2)
library(tigerstats)
library(xtable)
library(lattice)

pyg <- case1302  # Pygmalion effect data from Sleuth3

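First, let’s see that the F-test and the t-test agree when the model has a single predictor with one coefficient. Here the overall F-statistic is the square of the t-statistic for Treat (2.46 squared is about 6.05), and both tests give the same p-value, 0.0206.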

summary(lm(Score~Treat, data=pyg))

## 
## Call:
## lm(formula = Score ~ Treat, data = pyg)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -12.13  -7.20   1.30   5.20  13.07 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      71.632      1.688   42.45   <2e-16 ***
## TreatPygmalion    7.068      2.874    2.46   0.0206 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.356 on 27 degrees of freedom
## Multiple R-squared:  0.183,  Adjusted R-squared:  0.1528 
## F-statistic: 6.049 on 1 and 27 DF,  p-value: 0.0206
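The agreement between the two tests can be checked numerically rather than read off the printout. The following is a minimal sketch, assuming the Sleuth3 package is installed; the names fit1, t_stat, f_stat, p_t and p_f are introduced here for illustration only.

```r
library(Sleuth3)  # provides case1302

pyg <- case1302
fit1 <- lm(Score ~ Treat, data = pyg)
s <- summary(fit1)

# t-statistic for the single Treat coefficient
t_stat <- s$coefficients["TreatPygmalion", "t value"]

# Overall F-statistic and its degrees of freedom
f_stat <- s$fstatistic[["value"]]
num_df <- s$fstatistic[["numdf"]]
den_df <- s$fstatistic[["dendf"]]

# With one numerator degree of freedom, F equals t squared
all.equal(t_stat^2, f_stat)

# The two p-values agree as well
p_t <- s$coefficients["TreatPygmalion", "Pr(>|t|)"]
p_f <- pf(f_stat, num_df, den_df, lower.tail = FALSE)
all.equal(p_t, p_f)
```

This identity only holds when the F-test has a single numerator degree of freedom; once Company is in the model, the overall F-test and the t-test for Treat answer different questions.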

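Now let’s add more covariates and watch the p-value of the overall F-test go up (0.0206 with Treat alone, then 0.0564, then 0.3358), even though R-squared keeps increasing. Note that the t-test p-value for Treat does not follow the same pattern: it drops to 0.0119 once Company is added.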

summary(lm(Score~Treat+Company, data=pyg))

## 
## Call:
## lm(formula = Score ~ Treat + Company, data = pyg)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.660  -4.147   1.853   3.853   7.740 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    68.39316    3.89308  17.568 8.92e-13 ***
## TreatPygmalion  7.22051    2.57951   2.799   0.0119 *  
## CompanyC10      4.23333    5.36968   0.788   0.4407    
## CompanyC2       5.36667    5.36968   0.999   0.3308    
## CompanyC3       0.19658    6.01886   0.033   0.9743    
## CompanyC4      -0.96667    5.36968  -0.180   0.8591    
## CompanyC5       9.26667    5.36968   1.726   0.1015    
## CompanyC6      13.66667    5.36968   2.545   0.0203 *  
## CompanyC7      -2.03333    5.36968  -0.379   0.7094    
## CompanyC8       0.03333    5.36968   0.006   0.9951    
## CompanyC9       1.10000    5.36968   0.205   0.8400    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.576 on 18 degrees of freedom
## Multiple R-squared:  0.5647, Adjusted R-squared:  0.3228 
## F-statistic: 2.335 on 10 and 18 DF,  p-value: 0.0564

summary(lm(Score~Treat+Company+Treat*Company, data=pyg))

## 
## Call:
## lm(formula = Score ~ Treat + Company + Treat * Company, data = pyg)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##   -9.2   -2.3    0.0    2.3    9.2 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 66.200      5.094  12.996 3.89e-07 ***
## TreatPygmalion              13.800      8.823   1.564   0.1522    
## CompanyC10                   4.500      7.204   0.625   0.5477    
## CompanyC2                    6.100      7.204   0.847   0.4191    
## CompanyC3                   10.000      8.823   1.133   0.2863    
## CompanyC4                    0.300      7.204   0.042   0.9677    
## CompanyC5                   10.000      7.204   1.388   0.1985    
## CompanyC6                   15.600      7.204   2.166   0.0585 .  
## CompanyC7                   -1.100      7.204  -0.153   0.8820    
## CompanyC8                    4.300      7.204   0.597   0.5653    
## CompanyC9                    6.900      7.204   0.958   0.3632    
## TreatPygmalion:CompanyC10   -0.800     12.477  -0.064   0.9503    
## TreatPygmalion:CompanyC2    -2.200     12.477  -0.176   0.8639    
## TreatPygmalion:CompanyC3   -21.800     13.477  -1.618   0.1402    
## TreatPygmalion:CompanyC4    -3.800     12.477  -0.305   0.7676    
## TreatPygmalion:CompanyC5    -2.200     12.477  -0.176   0.8639    
## TreatPygmalion:CompanyC6    -5.800     12.477  -0.465   0.6531    
## TreatPygmalion:CompanyC7    -2.800     12.477  -0.224   0.8275    
## TreatPygmalion:CompanyC8   -12.800     12.477  -1.026   0.3317    
## TreatPygmalion:CompanyC9   -17.400     12.477  -1.395   0.1966    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.204 on 9 degrees of freedom
## Multiple R-squared:  0.7388, Adjusted R-squared:  0.1875 
## F-statistic:  1.34 on 19 and 9 DF,  p-value: 0.3358
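One way to see why the overall p-value rises while R-squared also rises is to rebuild each overall F-statistic from the printed R-squared and degrees of freedom, using F = (R²/numerator df) / ((1 − R²)/denominator df). The sketch below uses only the rounded values shown in the three summaries above, so the results match the printouts only up to rounding; the helper overall_f and the data frame models are names made up for this illustration.

```r
# Overall F-statistic from R-squared and the two degrees of freedom
overall_f <- function(r2, num_df, den_df) {
  (r2 / num_df) / ((1 - r2) / den_df)
}

# Values copied (rounded) from the three summary() printouts
models <- data.frame(
  model  = c("Treat", "Treat + Company", "Treat * Company"),
  r2     = c(0.1830, 0.5647, 0.7388),
  num_df = c(1, 10, 19),
  den_df = c(27, 18, 9)
)

models$F <- overall_f(models$r2, models$num_df, models$den_df)
models$p <- pf(models$F, models$num_df, models$den_df, lower.tail = FALSE)

print(models, digits = 4)
```

The explained variation is averaged over more and more numerator degrees of freedom while the denominator degrees of freedom shrink, so F falls even as R-squared grows.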

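Of course, failing to reject the hypothesis that every coefficient other than the intercept is 0 doesn’t mean they are all 0! With 20 coefficients fit to 29 observations, only 9 residual degrees of freedom remain, so the overall F-test has little power here.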

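For a single fitted model, anova(fit) gives a sequential table: each group of indicators is tested as it is added to the terms listed above it, using the residual mean square of the full model. The row for the last term is therefore the same as comparing the full model to the model with that term taken out. We can also do that comparison explicitly: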

fit_full <- lm(Score~Treat+Company+Treat*Company, data=pyg)
fit_reduced <- lm(Score~Treat+Company, data=pyg)
anova(fit_full, fit_reduced)

## Analysis of Variance Table
## 
## Model 1: Score ~ Treat + Company + Treat * Company
## Model 2: Score ~ Treat + Company
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1      9 467.04                           
## 2     18 778.50 -9   -311.46 0.6669 0.7221
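The F-statistic in this table is the extra-sum-of-squares F: the drop in residual sum of squares per extra parameter, divided by the residual mean square of the full model. As a rough check, it can be reproduced by hand from the rounded RSS and Res.Df values printed above; the variable names below are illustrative only.

```r
# Values copied from the anova(fit_full, fit_reduced) table
rss_full    <- 467.04; df_full    <- 9   # Score ~ Treat * Company
rss_reduced <- 778.50; df_reduced <- 18  # Score ~ Treat + Company

extra_ss <- rss_reduced - rss_full  # sum of squares explained by the interaction
extra_df <- df_reduced - df_full    # number of interaction coefficients

# Extra-sum-of-squares F-statistic and its p-value
f_stat  <- (extra_ss / extra_df) / (rss_full / df_full)
p_value <- pf(f_stat, extra_df, df_full, lower.tail = FALSE)

round(c(F = f_stat, p = p_value), 4)
```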

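Because the full model is listed first, Df and Sum of Sq are reported as negative; the F-statistic and p-value are unaffected. The same test (F = 0.6669, p = 0.7221) appears in the Treat:Company row of the sequential table: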

anova(fit_full)

## Analysis of Variance Table
## 
## Response: Score
##               Df Sum Sq Mean Sq F value  Pr(>F)  
## Treat          1 327.34  327.34  6.3080 0.03323 *
## Company        9 682.52   75.84  1.4614 0.29051  
## Treat:Company  9 311.46   34.61  0.6669 0.72212  
## Residuals      9 467.04   51.89                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
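Because the table is sequential, the rows for Treat and Company can change when the terms are entered in a different order, while the row for the last term does not. The design here does not look perfectly balanced (the C3 coefficients have a different standard error from the other companies), so some order dependence is expected. A minimal sketch, assuming the Sleuth3 package is installed; tab_treat_first and tab_company_first are names chosen for this example.

```r
library(Sleuth3)  # provides case1302

pyg <- case1302

# Same model, terms entered in two different orders
tab_treat_first   <- anova(lm(Score ~ Treat * Company, data = pyg))
tab_company_first <- anova(lm(Score ~ Company * Treat, data = pyg))

tab_treat_first
tab_company_first

# Row 3 is the interaction in both tables; it is the last term either way,
# so its F-statistic does not depend on the order of the main effects
f_int_1 <- tab_treat_first[3, "F value"]
f_int_2 <- tab_company_first[3, "F value"]
all.equal(f_int_1, f_int_2)

# Sum of squares for Treat entered first vs. entered after Company
c(first = tab_treat_first["Treat", "Sum Sq"],
  after_company = tab_company_first["Treat", "Sum Sq"])
```

If tests that do not depend on the order of the terms are wanted, comparing explicit pairs of nested models with anova(), as above, keeps it clear exactly which hypothesis each F-test addresses.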