Solution to STA 250 Assignment #2 (Fall 1999), First Data Set

Here is a model solution to Assignment #2 (Postscript, PDF), for one particular set of data (and here is another). This solution continues with the analysis of this data set from where the solution to Assignment #1 left off. The analysis below assumes that the data has been cleaned up as in Assignment #1, and that new Minitab columns have been created as described there. Some new columns were created for this assignment. You can view the final version of the worksheet with all the new columns in Postscript or PDF.

The discussion below is interspersed with portions of Minitab output. There are also links to a few plots, in Postscript or PDF formats. The Postscript is best viewed on CQUEST by choosing "0.707" from the "1.000" menu.

Effect of grain on illness

To address the question of whether feeding any grain (even a small amount) causes illness, I did a two-sample t-test comparing the days of illness for the 8 animals fed no grain with the days of illness for the 8 animals fed 0.5kg of grain. To do this, I created a new column called 'test1' set to '0' for the animals fed no grain, '1' for the animals fed 0.5kg of grain, and '*' for the animals fed more than 0.5kg of grain. The two groups can then be compared with the twot command, as follows (with the other animals being ignored):

MTB > twot 'ill' 'test1'

Two Sample T-Test and Confidence Interval

Two sample T for ill

test1       N      Mean     StDev   SE Mean
0           8      4.87      3.40       1.2
1           8      3.12      1.46      0.52

95% CI for mu (0) - mu (1): ( -1.2,  4.71)
T-Test mu (0) = mu (1) (vs not =): T = 1.34  P = 0.21  DF = 9

The two-sided p-value of 0.21 indicates that there is essentially no evidence that the mean days of illness differs for animals fed no grain and animals fed 0.5kg of grain. This conclusion is only valid if the assumptions behind the t-test are approximately correct, however.
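As a rough cross-check outside Minitab, the T and DF values above can be recomputed from the printed summary statistics. The Python sketch below assumes that twot, used without a pooling option, performs the unequal-variance (Welch) form of the test and truncates the approximate degrees of freedom to a whole number. The function name is made up for illustration, and the critical value 2.262 is assumed to be the tabled 97.5% point of the t-distribution with 9 degrees of freedom.

```python
import math

def welch_from_summary(n0, mean0, sd0, n1, mean1, sd1):
    """Unequal-variance two-sample t statistic, approximate df, and standard error."""
    v0 = sd0 ** 2 / n0          # squared standard error of the mean, group 0
    v1 = sd1 ** 2 / n1          # squared standard error of the mean, group 1
    se = math.sqrt(v0 + v1)     # standard error of the difference in means
    t = (mean0 - mean1) / se
    df = (v0 + v1) ** 2 / (v0 ** 2 / (n0 - 1) + v1 ** 2 / (n1 - 1))
    return t, df, se

# Summary statistics copied from the Minitab output for 'test1'.
t, df, se = welch_from_summary(8, 4.87, 3.40, 8, 3.12, 1.46)

t_crit = 2.262                  # assumed table value: 9 df, 95% confidence
diff = 4.87 - 3.12
ci = (diff - t_crit * se, diff + t_crit * se)

print(round(t, 2), math.floor(df), [round(x, 2) for x in ci])
```

Because the inputs are the rounded values that Minitab printed, small differences in the last digit from Minitab's own figures are to be expected.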

One assumption is that the distribution of days of illness in each group is fairly close to normal. Here are stemplots of the data by group (ignoring the ones in neither group):

MTB > stem 'ill';
SUBC> by 'test1'.

Character Stem-and-Leaf Display

Stem-and-leaf of ill       test1 = 0       N  = 8
Leaf Unit = 1.0


    1    0 1
    4    0 333
    4    0 55
    2    0 7
    1    0 
    1    1 
    1    1 2


Stem-and-leaf of ill       test1 = 1       N  = 8
Leaf Unit = 1.0


    1    0 1
   (4)   0 2233
    3    0 455

These distributions look approximately normal, except for one value of 12 days in the group fed no grain. Common sense tells us that we might expect to at least occasionally get animals that are sick for many days in any group, so we can expect that both distributions are at least somewhat skewed to the right. With only 8 cases in each group, this is cause for some concern that the p-value for the t-test might be somewhat inaccurate. In my opinion, this worry isn't serious enough to justify ignoring the results completely, though it is a reason not to put a great deal of trust in them.
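The stemplots contain enough information to recover the individual values, which gives a way to check the summary statistics used in the t-test. The Python sketch below reads each value as ten times the stem plus the leaf (so stem 1 with leaf 2 is the value of 12 days). The variable names are made up for illustration.

```python
import statistics

# Days of illness read off the stemplots above (stem = tens digit, leaf = units digit).
no_grain = [1, 3, 3, 3, 5, 5, 7, 12]     # test1 = 0
half_kg = [1, 2, 2, 3, 3, 4, 5, 5]       # test1 = 1

for label, days in (("test1 = 0", no_grain), ("test1 = 1", half_kg)):
    # statistics.stdev is the sample standard deviation (divisor n - 1).
    print(label, len(days), statistics.mean(days), round(statistics.stdev(days), 2))
```

The means and standard deviations can then be compared with the N, Mean and StDev columns of the twot output.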

The other assumption behind the t-test is that the observations in each group are independent of other observations, both in the same group, and in the other group. We can't tell whether this is true from looking at the data. Common sense tells us that the animals might catch diseases from each other, and that they might all be affected in the same way by factors such as the weather, both of which could lead to a lack of independence. Perhaps the researchers conducting the experiment would be able to say how much of a problem this is likely to be in practice, given the circumstances of the experiment.

I did another t-test to address the question of whether feeding lots of grain (1.5kg or 2.0kg) causes illness, compared to feeding a smaller amount of grain (0.5kg or 1.0kg). For this test, I created a column called 'test2', set to '0' for the 16 animals fed 0.5kg or 1.0kg of grain, to '1' for the 16 animals fed 1.5kg or 2.0kg of grain, and to '*' for the animals fed no grain. I then compared the days of illness for the two groups as follows:

MTB > twot 'ill' 'test2'

Two Sample T-Test and Confidence Interval

Two sample T for ill

test2        N      Mean     StDev   SE Mean
0           16      3.94      2.02      0.50
1           16      5.12      4.66       1.2

95% CI for mu (0) - mu (1): ( -3.84,  1.5)
T-Test mu (0) = mu (1) (vs not =): T = -0.94  P = 0.36  DF = 20

The two-sided p-value of 0.36 provides essentially no evidence against the null hypothesis that the mean days of illness is the same for the two groups. The assumption of independence behind this test could again be doubted, as explained above. The distributions of the data in the two groups can again be viewed via stemplots:

MTB > stem 'ill';
SUBC> by 'test2'.

Character Stem-and-Leaf Display

Stem-and-leaf of ill       test2 = 0       N  = 16
Leaf Unit = 1.0


    2    0 01
    6    0 2233
   (7)   0 4445555
    3    0 677


Stem-and-leaf of ill       test2 = 1       N  = 16
Leaf Unit = 1.0


    1    0 1
    8    0 2223333
    8    0 45555
    3    0 6
    2    0 
    2    1 
    2    1 
    2    1 
    2    1 67

As with the first t-test, there are reasons (both common sense and from looking at the plots above) to believe that the distributions are somewhat skewed. With 16 animals per group, this is less of a worry than for the first test, but is still cause for some concern.
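The second test can also be rerun from the individual values shown in the stemplots. The Python sketch below again assumes the unequal-variance (Welch) form of the two-sample t-test, with the approximate degrees of freedom truncated to a whole number. The names low_grain, high_grain and welch_t are made up for illustration.

```python
import math
import statistics

# Days of illness read off the two stemplots above.
low_grain = [0, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 7, 7]       # test2 = 0
high_grain = [1, 2, 2, 2, 3, 3, 3, 3, 4, 5, 5, 5, 5, 6, 16, 17]    # test2 = 1

def welch_t(a, b):
    """Unequal-variance t statistic and approximate df for samples a and b."""
    va = statistics.variance(a) / len(a)    # squared standard error of mean of a
    vb = statistics.variance(b) / len(b)    # squared standard error of mean of b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t, df = welch_t(low_grain, high_grain)
print(round(t, 2), math.floor(df))
```

The two values of 16 and 17 days account for most of the variance in the second group, which is why the standard error of the difference is large relative to the difference in means.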

One animal fed 1.0kg of grain died, perhaps quite quickly, since it is recorded as having been ill for zero days. I retained this animal for the tests above. Zero days illness is less than any other animal, but some other animals were ill only one day, so it's not that extreme an observation.

Effect of grain on weight gain

First, I'll look again at the simple regressions that were also done for Assignment #1, this time computing confidence intervals for the slope of the regression line.

Here is the regression of final weight ('ewt') on grain fed ('grain'):

The regression equation is
ewt = 243 + 36.6 grain

38 cases used 2 cases contain missing values

Predictor        Coef       StDev          T        P
Constant      243.295       6.395      38.04    0.000
grain          36.600       5.176       7.07    0.000

S = 23.15       R-Sq = 58.1%     R-Sq(adj) = 57.0%

Analysis of Variance

Source            DF          SS          MS         F        P
Regression         1       26791       26791     49.99    0.000
Residual Error    36       19292         536
Total             37       46084

There were 38 cases, after one animal was ignored because of a data recording error, and one was ignored because it died. The confidence interval for the slope (the regression coefficient of 'grain') is therefore found using the t-distribution with 38-2=36 degrees of freedom. From Table D in the book, the critical value for a 95% confidence interval is about 2.03, interpolating between the values given for 30 df and 40 df. The confidence interval for the slope can therefore be computed as

(36.6 - 2.03*5.176, 36.6 + 2.03*5.176) = (26.1, 47.1)
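The interpolation and the interval arithmetic can be spelled out as a short Python sketch. The two table entries (2.042 for 30 df and 2.021 for 40 df) are assumed here to be the 95% confidence critical values from a standard t table, and the helper name is made up for illustration.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation between the points (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# Assumed table entries: critical values for 95% confidence at 30 and 40 df.
t_crit = interpolate(36, 30, 2.042, 40, 2.021)

slope, se = 36.600, 5.176    # 'grain' row of the regression output above
ci = (slope - t_crit * se, slope + t_crit * se)

print(round(t_crit, 2), [round(x, 1) for x in ci])
```

Linear interpolation in the degrees of freedom is only an approximation, but over a range as narrow as 30 to 40 df the critical value changes so little that the error is well below the rounding used in the interval.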

Here is the regression of change in weight (called 'cwt', computed as 'ewt'-'swt') on grain fed:

The regression equation is
cwt = 79.8 + 42.7 grain

37 cases used 3 cases contain missing values

Predictor        Coef       StDev          T        P
Constant       79.838       4.900      16.29    0.000
grain          42.658       3.914      10.90    0.000

S = 17.05       R-Sq = 77.2%     R-Sq(adj) = 76.6%

Analysis of Variance

Source            DF          SS          MS         F        P
Regression         1       34525       34525    118.78    0.000
Residual Error    35       10173         291
Total             36       44698

Unusual Observations
Obs      grain        cwt         Fit   StDev Fit    Residual    St Resid
 22       1.00     157.00      122.50        2.80       34.50        2.05R 

R denotes an observation with a large standardized residual

There are only 37 cases here because of a data recording error for 'swt', which makes 'cwt' unknown for that animal. The t-distribution with 37-2=35 degrees of freedom should be used for the confidence interval, but the difference in the critical value from before is negligible. The confidence interval for the slope is found to be

(42.658 - 2.03*3.914, 42.658 + 2.03*3.914) = (34.7, 50.6)

The true values for the slopes for these two regressions should be the same. The first represents the effect of grain on the final weight, the second the effect of grain on the change in weight. These effects could be different only if the starting weight were different for animals fed different amounts of grain. But since the animals were assigned to groups fed different amounts of grain randomly, there should be no systematic tendency for animals fed different amounts of grain to have different starting weights. It is therefore not surprising that the two confidence intervals computed above overlap, since we expect that usually they will both include the same true value.

Although the estimated slopes in the two regressions are measures of the same thing (the effect of grain on weight), the second regression appears to be preferable, since it results in a smaller standard deviation for the residuals, hence a smaller standard error for the slope in the regression, and therefore a narrower confidence interval for this slope. This is because the random variation due to different starting weights is eliminated when we look at the change in weight rather than the final weight. When looking at males and females separately, it therefore makes sense to look at the change in weight rather than the final weight. Here is the regression for males:

The regression equation is
cwt-m = 80.2 + 31.6 grain-m

17 cases used 3 cases contain missing values

Predictor        Coef       StDev          T        P
Constant       80.200       4.707      17.04    0.000
grain-m        31.589       3.735       8.46    0.000

S = 10.53       R-Sq = 82.7%     R-Sq(adj) = 81.5%

Analysis of Variance

Source            DF          SS          MS         F        P
Regression         1      7924.2      7924.2     71.53    0.000
Residual Error    15      1661.7       110.8
Total             16      9585.9

And here is the regression for females:

The regression equation is
cwt-f = 80.5 + 51.7 grain-f

Predictor        Coef       StDev          T        P
Constant       80.468       4.499      17.89    0.000
grain-f        51.682       3.614      14.30    0.000

S = 11.98       R-Sq = 91.9%     R-Sq(adj) = 91.5%

Analysis of Variance

Source            DF          SS          MS         F        P
Regression         1       29381       29381    204.55    0.000
Residual Error    18        2585         144
Total             19       31967

Unusual Observations
Obs    grain-f      cwt-f         Fit   StDev Fit    Residual    St Resid
 12       1.00     157.00      132.15        2.68       24.85        2.13R 

R denotes an observation with a large standardized residual

Note that the standard deviation of the residuals is smaller in both of these regressions than for the regression done on all animals. The difference in slopes (31.6 versus 51.7) seems large compared to the standard errors for these slopes (3.7 and 3.6). This leads me to suspect that the effect of grain is different for males and females, but to do a formal statistical test, we need to combine these two regressions into one.

To do this, I computed the 'fgrain' column as follows:

MTB > let 'fgrain'='sex'*'grain'

I then did a regression of 'cwt' on 'sex', 'grain', and 'fgrain':

The regression equation is
cwt = 80.2 + 0.27 sex + 31.6 grain + 20.1 fgrain

37 cases used 3 cases contain missing values

Predictor        Coef       StDev          T        P
Constant       80.200       5.073      15.81    0.000
sex             0.268       6.624       0.04    0.968
grain          31.589       4.026       7.85    0.000
fgrain         20.093       5.283       3.80    0.001

S = 11.34       R-Sq = 90.5%     R-Sq(adj) = 89.6%

Analysis of Variance

Source            DF          SS          MS         F        P
Regression         3       40451       13484    104.77    0.000
Residual Error    33        4247         129
Total             36       44698

Source       DF      Seq SS
sex           1        3146
grain         1       35443
fgrain        1        1862

Unusual Observations
Obs        sex        cwt         Fit   StDev Fit    Residual    St Resid
 22       1.00     157.00      132.15        2.54       24.85        2.25R 
 30       1.00     135.00      157.99        3.06      -22.99       -2.10R 

R denotes an observation with a large standardized residual

The constant term and the coefficient for 'grain' match the values found in the regression for males only, which makes sense since for males the other two terms are zero (since both 'sex' and 'fgrain' are zero for males). The constant plus the coefficient for 'sex' matches the constant found in the regression for females only, and the sum of the coefficients for 'grain' and 'fgrain' matches the coefficient of 'grain' in the regression for females only. This multiple regression therefore effectively combines the two separate regressions.
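The way the combined regression reproduces the two separate lines can be checked with a few lines of arithmetic. The Python sketch below uses the coefficients printed in the combined regression and takes 'sex' to be 0 for males and 1 for females, as the text implies. The function and variable names are made up for illustration.

```python
# Coefficients from the combined regression output above.
CONST, SEX, GRAIN, FGRAIN = 80.200, 0.268, 31.589, 20.093

def fitted_cwt(sex, grain):
    """Fitted change in weight; sex is 0 for males and 1 for females."""
    fgrain = sex * grain                     # the interaction column
    return CONST + SEX * sex + GRAIN * grain + FGRAIN * fgrain

# Implied separate lines, as (intercept, slope) pairs.
male_line = (fitted_cwt(0, 0), fitted_cwt(0, 1) - fitted_cwt(0, 0))
female_line = (fitted_cwt(1, 0), fitted_cwt(1, 1) - fitted_cwt(1, 0))

print([round(x, 3) for x in male_line])      # compare with 80.200 and 31.589
print([round(x, 3) for x in female_line])    # compare with 80.468 and 51.682
print(round(fitted_cwt(1, 1.00), 2))         # compare with the Fit for observation 22
```

Because 'fgrain' is zero for every male, its coefficient can only change the slope for females, so it measures the difference between the female and male slopes directly.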

We can now test the null hypothesis that the coefficient of 'fgrain' is zero (i.e., that the effect of grain is the same for males and females). The (two-sided) p-value is 0.001, which constitutes quite strong evidence that this null hypothesis is false (i.e., that the effect of grain is different for males and females).
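For readers who want to see where such a p-value comes from, the Python sketch below recomputes it from the coefficient and its standard deviation, using the t-distribution with 33 degrees of freedom (the Residual Error DF in the output above). It integrates the t density numerically with Simpson's rule rather than calling a statistics library. The function names are made up for illustration, and this is not a description of how Minitab does the computation internally.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: one minus the probability between -|t| and |t|."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps                             # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3               # probability between -|t| and |t|
    return 1 - central

df = 33                                       # residual degrees of freedom
p_fgrain = two_sided_p(20.093 / 5.283, df)    # 'fgrain' row: Coef / StDev
p_sex = two_sided_p(0.268 / 6.624, df)        # 'sex' row: Coef / StDev

print(round(20.093 / 5.283, 2), p_fgrain)
print(round(0.268 / 6.624, 2), round(p_sex, 3))
```

The 'fgrain' p-value comes out between 0.0005 and 0.001, consistent with the 0.001 that Minitab prints to three decimal places.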

This conclusion is valid only if the assumptions behind the computation of this p-value are correct. One assumption is that the residuals are independent. Since the animals were assigned to groups randomly, no dependencies were introduced from this assignment. It is possible that there are dependencies arising from common factors such as the weather. For example, it could be that grain causes a bigger increase in weight for males than for females in hot weather, but it causes a bigger increase in weight for females than for males in cold weather. If it happened to have been cold all summer, the larger effect of grain on females that we see in the data could appear to be statistically significant, even if the average effect of grain on weight - over many summers, some hot, some cold - is the same for males and females. The possibility of such dependencies is reason for caution, but we shouldn't let our ability to imagine such scenarios prevent us from ever coming to any conclusions.

The other assumption behind the computation of the p-value is that the residuals are approximately normally distributed. We can check this using a normal quantile plot. I asked Minitab to store the residuals from the above regression in a new column ('RESI1'), and then produced a normal quantile plot as follows:

MTB > nscores 'RESI1' c18
MTB > plot 'RESI1'*c18

The resulting plot (Postscript, PDF) shows a good match to a normal distribution.
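The idea behind the nscores step can be sketched in Python. The version below assumes that normal scores are the standard normal quantiles at the plotting positions (i - 3/8)/(n + 1/4), where i is the rank of each value, and it does not handle ties. Both points are assumptions rather than a statement of exactly what Minitab does. The residual values are invented for illustration, since the actual residuals are not listed here.

```python
from statistics import NormalDist

def normal_scores(values):
    """Normal score for each value, in the original order (ties not handled)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])   # indices from smallest to largest
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        position = (rank - 0.375) / (n + 0.25)          # assumed plotting position
        scores[idx] = NormalDist().inv_cdf(position)
    return scores

# Hypothetical residuals, for illustration only.
residuals = [3.0, -1.0, 0.5, 2.0, -2.5]
scores = normal_scores(residuals)

for r, s in zip(residuals, scores):
    print(r, round(s, 3))
```

If the residuals are close to normally distributed, plotting them against their normal scores gives points lying close to a straight line.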

We can therefore be reasonably sure that there is a difference in the effect of grain on males and females - feeding an extra kilogram of grain per day increases weight gain over the summer in females by about 52 kilograms, but it increases weight gain in males by only about 32 kilograms. In contrast, the p-value for the test of the hypothesis that the coefficient of 'sex' is zero is 0.968. Hence we have no reason to believe that males and females gain different amounts of weight (on average) when they are fed no grain.

Finally, I did a regression of the final weight ('ewt') on 'sex', 'grain', 'fgrain', 'swt', and 'age'. Since the starting weight ('swt') is included as an explanatory variable, it is no longer necessary to look at the change in weight rather than the final weight - the effect of random variation in 'swt' will not show up in the residual when 'swt' is an explanatory variable. (The previous technique of looking at 'ewt'-'swt' rather than 'ewt' is in fact equivalent to doing a regression for 'ewt' with 'swt' included as an explanatory variable, but with its coefficient fixed at one.) By including 'age' as an explanatory variable, variation in this variable resulting from random assignment to groups will also be removed from the residual.

Here is the result:

ewt = 215 - 5.34 sex + 28.1 grain + 24.5 fgrain + 1.12 swt - 0.411 age

37 cases used 3 cases contain missing values

Predictor        Coef       StDev          T        P
Constant       215.10       65.85       3.27    0.003
sex            -5.344       7.160      -0.75    0.461
grain          28.052       4.223       6.64    0.000
fgrain         24.543       5.489       4.47    0.000
swt            1.1249      0.1687       6.67    0.000
age           -0.4110      0.2283      -1.80    0.082

S = 10.92       R-Sq = 92.0%     R-Sq(adj) = 90.7%

Analysis of Variance

Source            DF          SS          MS         F        P
Regression         5     42373.4      8474.7     71.11    0.000
Residual Error    31      3694.6       119.2
Total             36     46068.0

Source       DF      Seq SS
sex           1         7.0
grain         1     28016.7
fgrain        1      4810.3
swt           1      9153.3
age           1       386.1

Unusual Observations
Obs        sex        ewt         Fit   StDev Fit    Residual    St Resid
  8       0.00     203.00      207.34        7.63       -4.34       -0.56 X
 22       1.00     310.00      286.91        2.80       23.09        2.19R 
 30       1.00     280.00      301.75        3.10      -21.75       -2.08R 

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

Not much has changed from the previous regression. The standard deviation of the residuals is only slightly smaller, indicating that including 'age' and allowing 'swt' to have a coefficient other than one has not made the predictions much better. The (two-sided) p-value for testing the null hypothesis that the coefficient of 'age' is zero is 0.082, which is only very mild evidence against the null hypothesis. The animals were all about the same age, however, which makes it difficult to get a good estimate of the effect of age, so this failure to clearly reject the null hypothesis would not be surprising even if age does have some effect.
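The comparison of residual standard deviations can be reproduced from the two Analysis of Variance tables. The Python sketch below uses the usual definitions (S is the square root of the residual mean square, R-Sq is one minus the ratio of residual to total sum of squares, and R-Sq(adj) uses mean squares in place of sums of squares). The function name is made up for illustration.

```python
import math

def fit_summary(sse, sst, n, n_predictors):
    """Residual SD, R-squared and adjusted R-squared from ANOVA sums of squares."""
    df_resid = n - n_predictors - 1
    s = math.sqrt(sse / df_resid)                        # square root of residual mean square
    r_sq = 1 - sse / sst
    r_sq_adj = 1 - (sse / df_resid) / (sst / (n - 1))
    return s, r_sq, r_sq_adj

# Residual Error SS and Total SS from the two ANOVA tables (37 cases each).
previous = fit_summary(4247, 44698, 37, 3)       # cwt on sex, grain, fgrain
final = fit_summary(3694.6, 46068.0, 37, 5)      # ewt on sex, grain, fgrain, swt, age

for label, (s, r_sq, r_sq_adj) in (("previous", previous), ("final", final)):
    print(label, round(s, 2), round(100 * r_sq, 1), round(100 * r_sq_adj, 1))
```

The two residual standard deviations are in the same units (kilograms), so they can be compared directly even though the response variable differs between the two regressions.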