
STA 248 -- Statistics for Computer Science -- Winter 2005

Exam Information:
Here is the formula sheet you will be given on the exam.
Bring a calculator.
You will not be asked to provide R commands on the exam, but there will be a lot of R output for you to interpret.
Practice problems from Assignment 4 that you can skip when preparing for the exam: Chapter 13: 34, 35, 36, 37, 38, 39, Chapter 15: 17, 22
The exam covers all the material from Lectures 1 through 24; the lecture notes are available below. This includes everything on Assignments 1 through 4 except the questions noted above. Corresponding textbook sections are also listed below (updated on April 6), although not everything is in the text.
Last year's exam is available at the bottom of this page. Solutions for this and all assignment problems are now available (April 14).

Regarding Section 14.1 (two-way ANOVA):
You don't need to memorize any of the formulae in Section 14.1 of the text for the exam. For two-way ANOVA you need to know: the model, how to interpret the ANOVA table (i.e. what hypotheses are being tested and how to draw conclusions from the p-values), how to interpret output from Tukey's procedure, and how to interpret interaction plots. This is essentially what we covered in lecture, and what you were asked to do in the assignment questions (including the practice problems).
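If you want to see what these pieces of output look like, here is a minimal sketch in R. It uses the warpbreaks data set that ships with R, which is an assumption made for illustration and is not one of the course data sets; the variable name fit is also just a placeholder.

```r
# Illustration only: warpbreaks is a built-in R data set, not course data.
# Response: breaks. Factors: wool (A, B) and tension (L, M, H).
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# ANOVA table: one row each for wool, tension, the wool:tension
# interaction, and the residuals, with an F test and p-value per effect.
summary(fit)

# Tukey's procedure for all pairwise comparisons of the tension levels.
tk <- TukeyHSD(fit, "tension")
tk

# Interaction plot: mean of breaks at each tension level, one line per wool.
# Roughly parallel lines suggest little interaction.
with(warpbreaks, interaction.plot(tension, wool, breaks))
```

When reading the ANOVA table, look at the interaction row first: if the interaction is significant, the main effects have to be interpreted with care.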

Topics covered in lecture that aren't in the textbook (or aren't in what you might consider the obvious place) but that you are responsible for knowing on the exam:
trimmed mean, basic idea of the Central Limit Theorem, normal probability plots, normal distribution theory (how standard normal, chi-squared, t, and F distributions are related to each other), data collection (observational studies and experiments), bootstrap (estimates of sampling distributions and standard errors, confidence intervals, one- and two-sample tests)
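As a reminder of two of these ideas, here is a rough sketch in R of a trimmed mean and a bootstrap estimate of its standard error. The data are simulated, and the names x, B and boot_tm, the 10% trimming fraction and the percentile interval are all choices made for this illustration; the version of the bootstrap interval covered in lecture may differ.

```r
set.seed(248)
x <- rexp(30, rate = 1/5)   # made-up skewed sample, for illustration only

mean(x)               # ordinary sample mean
mean(x, trim = 0.1)   # 10% trimmed mean: drops the smallest and largest 10%
                      # of the observations before averaging

# Bootstrap: resample the data with replacement and recompute the statistic.
B <- 2000
boot_tm <- replicate(B, mean(sample(x, replace = TRUE), trim = 0.1))

sd(boot_tm)                          # bootstrap estimate of the standard error
quantile(boot_tm, c(0.025, 0.975))   # a simple percentile 95% interval
```

A histogram of boot_tm (hist(boot_tm)) gives the bootstrap estimate of the sampling distribution of the trimmed mean.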

For extra help before the exam:
New College Statistics Aid Centre
Tuesday, April 19 10:00-12:00 Alison will be in SS 3103
Wednesday, April 20 2:00-4:00 Alison will be in SS 3103
Thursday, April 21 2:00-4:00 Jeff will be in SS 2131
E-mail Alison with short questions or to make an appointment at a different time.

Course outline in pdf

NEW! Scanned lecture notes (Lectures 1 through 24) (Thanks to Christian and Sina for providing scans of the overhead transparencies.)

Term test solutions in pdf

Data files from the text are available here

R examples from lecture

A running list of what we have covered or are about to cover from the text:
- All of Chapter 6
- Chapter 7, excluding section 7.3. (The first paragraph of section 7.3 is worth reading.)
- Chapter 8, excluding sections 8.6 and 8.7
- Chapter 9: Sections 9.1 and 9.2 (and not sections 9.3 and 9.4)
- Chapter 10, excluding 10.6 and de-emphasizing 10.2 (although we will go over the relevant distribution theory)
- Chapter 13: Sections 13.1 and 13.3 (excluding Duncan's Multiple Range Test and section 13.5)
- Chapter 14: Sections 14.1 and 14.2
- Chapter 15: Sections 15.3 and 15.4 excluding McNemar's test

Assignment 4 in pdf
Data for additional problem 5 (text)
You do not have to do part (d) of Chapter 13 Exercise 22.

Reading in the data for Chapter 13 Exercise 6:
Here is one way you can read in the data from the data file on the web:
1. Read the data into R using the scan command:
amountofmoney <- scan("filename")
This makes a vector of the data values from the file. Note that it reads across a row, then the next row, etc. You can look at it by just typing amountofmoney.
2. Create a vector for the size of company groups 1,2,3. This command will do it:
sizegroup <- rep(c(1,2,3),16)
3. Turn sizegroup into a categorical variable with:
sizegroup <- factor(sizegroup)
The following command will give the ANOVA table:
summary( aov(amountofmoney ~ sizegroup))
To get the variance for just one of the groups (e.g. the 1st), use
var(amountofmoney[sizegroup==1])
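
To check that the group labels line up with the values, the steps above can be tried on a small stand-in file first. The sketch below assumes the layout implied above (16 rows with one value per size group in each row); the numbers are made up, and the names fake, tmp and fit are placeholders that are not part of the exercise.

```r
# Made-up data in the assumed layout: 16 rows, one column per size group.
set.seed(1)
fake <- matrix(round(rnorm(48, mean = 50, sd = 10), 1), nrow = 16, ncol = 3)
tmp <- tempfile(fileext = ".txt")
write.table(fake, tmp, row.names = FALSE, col.names = FALSE)

# scan() reads across each row, so the values arrive in the order
# group 1, group 2, group 3, group 1, group 2, group 3, ...
amountofmoney <- scan(tmp, quiet = TRUE)
sizegroup <- factor(rep(c(1, 2, 3), 16))

# The values labelled group 1 should match the first column of the file.
all.equal(amountofmoney[sizegroup == 1], fake[, 1])

# One-way ANOVA table, then sample sizes and variances for every group.
fit <- aov(amountofmoney ~ sizegroup)
summary(fit)
table(sizegroup)
tapply(amountofmoney, sizegroup, var)
```

The tapply() line gives all three group variances at once, which is equivalent to running the var() command above once per group.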

Assignment 3 in pdf
Note: Due date is Tuesday, March 22.
R workspace containing data for Additional Problem 1
R workspace containing data for Additional Problem 9
R workspace containing data for Additional Problem 10
An important piece of missing information for Additional Problem 5: It is thought that the standard deviation of a single observation is approximately 10 under both conditions. Also, use a normal distribution approximation.

Assignment 2 in pdf

Assignment 1 in pdf

Solutions to assignments

Term test from Winter 2004 in pdf
Solutions to term test from Winter 2004 in pdf

Exam from Winter 2004 in pdf
Solutions to exam from Winter 2004 in pdf

E-mail course instructor: alison.gibbs@utstat.utoronto.ca