Solution to STA 250 Assignment #1 (Fall 1999), Second Data Set

Here is a model solution to Assignment #1 (Postscript, PDF), for one particular set of data (and here is another). I have written the report to follow more-or-less the steps I went through in analysing the data, so that you can see what I did, though this wouldn't always be the best way to write a report like this. The links below get you to the data, Minitab worksheets, and Minitab plots. The worksheets and plots can be accessed in either Postscript or PDF formats. The Postscript is best viewed on CQUEST by choosing "0.707" from the "1.000" menu.

Initial examination of the data

I read the original data into Minitab, and then gave the columns appropriate names, producing a Minitab worksheet with the original data (Postscript, PDF).

I then looked at stemplots for the 'age', 'swt', 'ill', and 'ewt' variables, and found several implausible values. In case 27, 'age' was 37 days, which isn't possible since the animals were supposed to all be about a year old. In case 38, 'swt' was 18kg, which is not possible for cattle of this age. In case 5, 'ill' was 118 days, which is not possible because the experiment ran for only 100 days. I replaced all these erroneous values by "*" so that Minitab will ignore these cases whenever the variable in question is needed.
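
The same cleaning step can be sketched outside Minitab. The Python below is only an illustration under stated assumptions: the rows are made up, Python's None stands in for Minitab's "*", and apart from the 100-day limit on 'ill' the plausibility bounds are guesses chosen for the example.

```python
# Hypothetical rows for illustration only; these are not the assignment data.
rows = [
    {"case": 1, "age": 362, "swt": 151, "ill": 4},
    {"case": 2, "age": 37, "swt": 149, "ill": 0},     # age far too small
    {"case": 3, "age": 370, "swt": 18, "ill": 2},     # weight far too small
    {"case": 4, "age": 358, "swt": 160, "ill": 118},  # more than 100 days ill
]

# Assumed plausibility ranges (inclusive). Only the upper limit of 100 days
# for 'ill' comes from the report; the other bounds are guesses.
limits = {"age": (300, 450), "swt": (100, 300), "ill": (0, 100)}


def mark_implausible(rows, limits):
    """Return copies of the rows with out-of-range values replaced by None."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy, so the original data are left untouched
        for name, (low, high) in limits.items():
            value = new_row[name]
            if value is not None and not (low <= value <= high):
                new_row[name] = None
        cleaned.append(new_row)
    return cleaned


cleaned = mark_implausible(rows, limits)
for row in cleaned:
    print(row)
```

Only the offending value is blanked, not the whole case, so the case can still be used whenever that variable isn't needed.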

Death and illness

Five of the animals died during the experiment, all of them in group 5, which was fed the largest amount of grain (2.0kg). This raises the question of whether feeding the cattle grain affected their health. It also presents a problem of what to do with these dead animals when analysing the effect of grain on the animals' final weight.

I looked at side-by-side boxplots of the number of days of illness for the five groups fed different amounts of grain (Postscript, PDF). The animals not fed any grain seem to have been healthier than the animals that were fed grain, but from this plot it seems to make little difference how much grain was fed (only whether they were fed any grain or not). However, the plot may be misleading, because for the dead animals, what is recorded is the number of days of illness before they died, which could be small if they died early. Unfortunately, no record is available of when they died, so we can't tell on what proportion of the days they were alive they were ill. This may have the effect of making the amount of illness in group 5 seem less than it really was.

To try to get a better picture, I produced side-by-side boxplots of the number of days of illness for each group with the dead animals excluded (Postscript, PDF). From these plots, it seems plausible that there is an increasing tendency to illness the more grain the animals are fed, rather than there being just a difference between animals fed no grain and those fed some grain.

Since the animals' health appears to be affected by feeding them grain, the analysis of how feeding grain affects the animals' final weight will have to account for this.

Unstacking the data

To produce the boxplots mentioned above for just the animals that lived, and also many of the plots mentioned below, I used the unstack command to create two new sets of columns, one set containing data on only the animals that lived, and the other set containing data on only the animals that died. The columns with data on only live animals were called 'sex-alive', 'age-alive', 'swt-alive', 'group-alive', 'grain-alive', 'ill-alive', and 'ewt-alive'.
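
For readers not using Minitab, the effect of unstacking can be sketched in Python. This is an illustration only: the rows are made up, the sketch assumes an animal is dead exactly when its 'ewt' is recorded as 0, and the '-dead' column names are hypothetical (only the '-alive' names come from the worksheet).

```python
# Hypothetical rows for illustration only; these are not the assignment data.
rows = [
    {"group": 1, "grain": 0.0, "ill": 1, "ewt": 230},
    {"group": 5, "grain": 2.0, "ill": 9, "ewt": 0},    # died (ewt recorded as 0)
    {"group": 3, "grain": 1.0, "ill": 4, "ewt": 251},
    {"group": 5, "grain": 2.0, "ill": 12, "ewt": 270},
]


def unstack_by_survival(rows, names):
    """Split the rows into two sets of columns: live animals and dead animals.

    Assumption for this sketch: an animal is dead exactly when 'ewt' is 0.
    """
    alive = {name + "-alive": [] for name in names}
    dead = {name + "-dead": [] for name in names}
    for row in rows:
        if row["ewt"] == 0:
            target, suffix = dead, "-dead"
        else:
            target, suffix = alive, "-alive"
        for name in names:
            target[name + suffix].append(row[name])
    return alive, dead


alive, dead = unstack_by_survival(rows, ["group", "grain", "ill", "ewt"])
print(alive["grain-alive"], alive["ill-alive"])
print(dead["grain-dead"], dead["ill-dead"])
```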

Effect of feeding grain on weight, looking at all animals

One way of trying to account for the fact that some animals died, probably for reasons relating to how much grain they ate, is to just ignore the problem, and treat these animals as having a final weight of zero, which is the number recorded for them in the 'ewt' variable. Since the value to the farmer of live animals is approximately proportional to their weight, and the value to the farmer of dead animals is zero, this makes at least some sense.

I looked at a scatterplot of 'ewt' vs. 'grain' for all the animals, with the regression line for 'ewt' vs. 'grain' plotted (Postscript, PDF). The regression equation found was

ewt = 252 - 45.5 grain

The negative slope of the regression line would seem to indicate that feeding the animals grain actually reduces the farmer's revenue (which is proportional to 'ewt'), when the dead animals are included. However, the plot shows that the relationship is not very close to a straight line. The dead animals in group 5 have a big influence on the regression line, which then doesn't fit the data very well elsewhere.

Another approach is to look at the change in weight of the animals. I called this 'cwt', and computed it as 'ewt' - 'swt'. (I computed 'cwt-alive' as 'ewt-alive' - 'swt-alive' too, for later use.) Looking at how 'grain' affects 'cwt' has the same significance to the farmer as looking at how it affects 'ewt', because the farmer's actions in feeding grain or not can only influence the final weight by influencing the change in weight, since 'swt' is already fixed by this time.
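
The computation of 'cwt' is simple, but it is worth being explicit about missing values. A minimal Python sketch follows, assuming that None plays the role of Minitab's "*" and that a missing 'swt' or 'ewt' makes 'cwt' missing too; the weights shown are made up.

```python
def weight_change(swt, ewt):
    """Return ewt - swt case by case, giving None when either weight is missing."""
    return [
        None if start is None or end is None else end - start
        for start, end in zip(swt, ewt)
    ]


# Hypothetical values for illustration only; these are not the assignment data.
swt = [150, None, 160, 155]
ewt = [240, 235, 0, None]   # an 'ewt' of 0 marks an animal that died

cwt = weight_change(swt, ewt)
print(cwt)
```

Note that a dead animal gets a large negative 'cwt' (its final weight of zero minus its starting weight), which is what makes the dead animals so influential when all animals are analysed together.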

I looked at a scatterplot of 'cwt' vs. 'grain' for all animals, with the regression line shown as well (Postscript, PDF). The regression equation is

cwt = 94 - 33.0 grain

The slope is negative, as it was for the regression for 'ewt'. The plot shows large negative values of 'cwt' for the dead animals, as expected, and it also shows one point for which 'cwt' is zero. This point is for case 20, in which 'swt' and 'ewt' are both 147. It seems unlikely that this animal gained no weight, and furthermore had exactly the same weight after the experiment as before. Probably the value for 'ewt' was accidentally set to be the same as for 'swt'. I therefore replaced 'ewt' for this animal by "*", and set the values for this animal in all the columns (such as 'cwt') that were computed using 'ewt' to "*" as well. I then redid the regression of 'cwt' on 'grain' (Postscript, PDF), obtaining the regression equation

cwt = 95 - 32.9 grain

It is clear from the plot, however, that this line does a very poor job of representing the relationship of 'cwt' to 'grain' for the live animals. Although the negative slope in this regression may be a valid indication that the overall effect of feeding grain is harmful, it fails to provide insight into the separate effects of feeding grain on weight gain and of feeding grain on health.

Effect of feeding grain on weight, live animals only

It seems better to look at how feeding grain affects the weight of the animals that live, and to separately note that feeding grain also affects how likely the animals are to live. I therefore made a scatterplot of 'cwt-alive' vs. 'grain-alive', in which only live animals were included, with the regression line shown as well (Postscript, PDF). The regression equation found was

cwt-alive = 69 + 22.1 grain-alive

The standard deviation of the residuals was 10.8.
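
For reference, here is a minimal Python sketch of how a least-squares line and the standard deviation of its residuals can be computed. It is not the Minitab procedure itself: the data are made up, and the sketch assumes the residual standard deviation uses the usual divisor of n - 2.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed y minus the value predicted by the fitted line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def residual_sd(res):
    """Standard deviation of the residuals, assuming a divisor of n - 2."""
    return math.sqrt(sum(r * r for r in res) / (len(res) - 2))


# Hypothetical values for illustration only; these are not the assignment data.
grain = [0.0, 0.5, 1.0, 1.5, 2.0]
cwt = [60, 75, 80, 98, 102]

intercept, slope = fit_line(grain, cwt)
res = residuals(grain, cwt, intercept, slope)
print("intercept:", round(intercept, 1), "slope:", round(slope, 1))
print("residual sd:", round(residual_sd(res), 1))
```

The slope is the estimated change in weight gain for each extra kilogram of grain fed per day, and the residual standard deviation describes how far individual animals typically fall from the line.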

This indicates that for animals that live, feeding grain increases the amount of weight they gain. I looked at a scatterplot of 'cwt-alive' vs. 'grain-alive' with males and females marked with different symbols (Postscript, PDF) in order to see if the relationship was different for males and females. No effect of sex is visible.

Relationships among illness, weight gain, and amount of grain fed

I looked at a scatterplot of 'cwt-alive' vs. 'ill-alive' to try to see how weight gain is affected by illness, for animals that lived (Postscript, PDF). This plot shows a positive relationship, as if being ill increased weight gain! This could be misleading, however. We know that feeding grain increases both weight gain and the amount of illness, so the positive relationship seen between 'cwt-alive' and 'ill-alive' could just be a result of those two relationships.

To get a better picture of how illness affects weight gain, I plotted the residuals of the regression of 'cwt-alive' on 'grain-alive' against the value of 'ill-alive' (Postscript, PDF). No relationship is visible, indicating that illness itself does not affect the weight gained, either positively or negatively. This is a bit surprising, since one might expect illness to have a negative impact on weight gain, but if so, the effect is too small to see amid the chance variation.
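
The reasoning behind the residual plot can be illustrated with a small Python sketch. The data below are made up, and were deliberately constructed so that illness and weight gain both rise with grain while illness has no relationship with weight gain beyond that; correlations are used here in place of looking at the plots.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def correlation(x, y):
    """Pearson correlation coefficient of x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, constructed for illustration; not the assignment data.
grain = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
ill = [1, 3, 4, 6, 7, 9]             # illness rises with grain
cwt = [72, 68, 89, 91, 109, 111]     # weight gain rises with grain

# Raw relationship: looks as if illness goes with greater weight gain.
r_raw = correlation(ill, cwt)

# Remove the part of weight gain explained by grain, then look again.
intercept, slope = fit_line(grain, cwt)
resid = [c - (intercept + slope * g) for g, c in zip(grain, cwt)]
r_resid = correlation(ill, resid)

print("cwt vs ill:", round(r_raw, 2))
print("residuals vs ill:", round(r_resid, 2))
```

In this constructed example the strong raw association disappears once the effect of grain has been removed, which is the pattern described above for the real data.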

Modified worksheet

For reference, here is the worksheet with outliers deleted, with the 'cwt' column set to 'ewt'-'swt', and with the separate columns for animals that lived and died: Postscript, PDF.

Conclusions

Feeding grain to the animals has a bad effect on their health, sometimes leading to death. Even feeding 0.5kg of grain each day (the smallest amount tested) seems to increase illness, compared to animals fed no grain. Feeding lots of grain (1.5kg to 2.0kg) seems to have an even worse effect on the health of the animals. One possible explanation is that the grain is contaminated with some toxic compound or with some disease-causing organism. This would explain why feeding even a little bit of grain (0.5kg per day) has a bad effect. All five of the animals in this experiment that died had been fed 2.0kg of grain per day, but since illness increased even with smaller amounts of grain, it seems quite possible that smaller amounts increase the chance of death as well.

For those animals that live, grain increases the amount of weight they gain. For each extra kilogram of grain fed per day, the final weight of the animal is increased by about 22 kilograms. No differences were seen between males and females in this regard. Illness also seems to have no effect on weight gain, provided the animal lives.

These conclusions apply only to animals of the sort used in this experiment, and which (as in this experiment) are approximately one year old at the beginning. The fact that feeding grain led to illness indicates that there may have been something wrong with the grain used in the experiment, which means that the results might have been quite different if grain that wasn't contaminated were used.

It is difficult to give advice to the farmers on the basis of this experiment. Provided their animals live, they will gain more weight if fed grain, but the loss from some of the animals dying when fed grain may outweigh this benefit. However, if the illness is due to contamination of the grain, it would not be a problem if uncontaminated grain can be obtained.