STA 248 -- Statistics for Computer Science -- Winter 2005
Exam Information:
Here is the formula sheet you will be given on the exam.
Bring a calculator.
You will not be asked to provide R commands on the
exam, but there will be a lot of R output for you to interpret.
Practice problems from Assignment 4 that you can skip
when preparing for the exam:
Chapter 13: 34, 35, 36, 37, 38, 39, Chapter 15: 17, 22
The exam covers all the material covered in Lectures 1 through
24. The lecture notes are available below. This includes all of the material
covered on assignments 1 through 4 excluding the questions noted
above. Corresponding textbook sections are also noted below
(updated on April 6), although not everything is in the text.
Last year's exam is available at the bottom of this page.
Solutions for this and all assignment problems are now available (April 14).
Regarding Section 14.1 (two-way ANOVA):
You don't need to memorize any of the formulae in Section 14.1
of the text for the exam. For two-way ANOVA you need to know: the model,
how to interpret the ANOVA table (i.e. what hypotheses are
being tested, how to make conclusions based on the
p-values), how to interpret output from Tukey's procedure
and how to interpret interaction plots. This is
essentially what we covered in lecture, and what you
were asked to do in the assignment questions (including
the practice problems).
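For practice reading this kind of output, here is a minimal sketch of how a two-way ANOVA table, Tukey output and an interaction plot can be produced in R. The data are made up purely for illustration (the factor names A and B, the response y and all of the numbers are assumptions, not course data), and this is only one way to do it.

```r
# Made-up balanced data: 2 levels of A, 3 levels of B, 4 replicates per cell.
# Everything here is hypothetical; it only exists so the commands run.
set.seed(2)
d <- expand.grid(rep = 1:4, A = c("a1", "a2"), B = c("b1", "b2", "b3"))
d$A <- factor(d$A)
d$B <- factor(d$B)
d$y <- rnorm(nrow(d), mean = 20, sd = 2) + ifelse(d$A == "a2", 3, 0)

# Model: response = overall mean + A effect + B effect
#                   + A:B interaction + random error.
fit2 <- aov(y ~ A * B, data = d)

# ANOVA table: one row (and one p-value) each for A, B and A:B.
tab <- summary(fit2)
print(tab)

# Tukey's procedure for all pairwise comparisons of the levels of B.
tuk <- TukeyHSD(fit2, which = "B")
print(tuk)

# Interaction plot: mean of y at each level of B, one line per level of A.
# Roughly parallel lines suggest little interaction.
# (In a non-interactive session the plot is written to Rplots.pdf.)
with(d, interaction.plot(B, A, y))
```

In the ANOVA table, the A:B row tests for no interaction and the A and B rows test for no main effect of each factor. In the Tukey output, an interval that excludes 0 indicates a difference between that pair of levels.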
Topics covered in lecture that aren't in the textbook (or
aren't in what you might consider the obvious place) but that
you are responsible for knowing on the exam:
trimmed mean, basic idea of the
Central Limit Theorem, normal probability plots,
normal distribution theory (how standard normal, chi-squared,
t, and F distributions are related to each other), data
collection (observational studies and experiments),
bootstrap (estimates of sampling distributions and
standard errors, confidence intervals, one- and two-sample
tests)
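As a reminder of two of these topics, here is a minimal sketch of a trimmed mean together with a bootstrap estimate of its standard error and a percentile confidence interval. The sample is made up, the 10% trimming and 2000 resamples are arbitrary choices, and the percentile interval is only one of several bootstrap intervals, so it may not match exactly what was done in lecture.

```r
# Made-up skewed sample of size 30 (hypothetical data for illustration).
set.seed(3)
x <- rexp(30, rate = 1/5)

# 10% trimmed mean: drop the lowest and highest 10% of the observations,
# then average what is left.
tm <- mean(x, trim = 0.1)

# Bootstrap: resample the data with replacement, recompute the statistic,
# and repeat many times to approximate its sampling distribution.
B <- 2000
boot_tm <- replicate(B, mean(sample(x, replace = TRUE), trim = 0.1))

# Bootstrap standard error = standard deviation of the bootstrap statistics.
se_boot <- sd(boot_tm)

# 95% percentile interval: middle 95% of the bootstrap statistics.
ci_pct <- quantile(boot_tm, c(0.025, 0.975))

print(tm)
print(se_boot)
print(ci_pct)
```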
For extra help before the exam:
New College Statistics Aid Centre
Tuesday, April 19 10:00-12:00 Alison will be in SS 3103
Wednesday, April 20 2:00-4:00 Alison will be in SS 3103
Thursday, April 21 2:00-4:00 Jeff will be in SS 2131
E-mail Alison with short questions or to make an
appointment at a different time.
NEW! Scanned lecture notes (Lectures 1 through 24) (Thanks to Christian and Sina for providing scans of the overhead transparencies.)
Data files from the text are available here
A running list of what we have covered or are about to
cover from the text:
- All of Chapter 6
- Chapter 7, excluding section 7.3. (The first paragraph
of section 7.3 is worth reading.)
- Chapter 8, excluding sections 8.6 and 8.7
- Chapter 9: Sections 9.1 and 9.2 (and not
sections 9.3 and 9.4)
- Chapter 10, excluding 10.6 and de-emphasizing 10.2
(although we will go over the relevant distribution theory)
- Chapter 13: Sections 13.1 and 13.3 (excluding Duncan's
Multiple Range Test), and not section 13.5
- Chapter 14: Sections 14.1 and 14.2
- Chapter 15: Sections 15.3 and 15.4 excluding McNemar's test
Assignment 4 in pdf
Data for additional problem 5 (text)
You do not have to do part (d) of Chapter 13 Exercise 22.
Reading in the data for Chapter 13 Exercise 6:
Here is one way you can read in the data from the data file
on the web:
1. Read the data into R using the scan command:
amountofmoney <- scan("filename")
This makes a vector of the data values from the file. Note
that it reads across a row, then the next row, etc. You
can look at it by just typing amountofmoney.
2. Create a vector for the size of company groups 1,2,3.
This command will do it:
sizegroup <- rep(c(1,2,3),16)
3. Turn sizegroup into a categorical variable with:
sizegroup <- factor(sizegroup)
The following command will give the anova table:
summary( aov(amountofmoney ~ sizegroup))
To get the variance for just one of the groups (e.g. the 1st),
use
var(amountofmoney[sizegroup==1])
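The steps above can be put together as one script. This is a minimal sketch: the numbers are made up so that the script runs on its own, and the temporary file stands in for the data file from the web. It assumes the file holds 16 rows of 3 values, one column per size group, which is the layout that rep(c(1,2,3),16) implies.

```r
# Made-up values standing in for the real data file: 16 rows, 3 columns,
# one column per company-size group (an assumption about the file layout).
set.seed(1)
fake <- matrix(round(rnorm(48, mean = 50, sd = 10), 1), nrow = 16, ncol = 3)
datafile <- tempfile()
write.table(fake, datafile, row.names = FALSE, col.names = FALSE)

# Step 1: scan reads across each row, then moves on to the next row.
amountofmoney <- scan(datafile)

# Step 2: group labels 1,2,3 repeated once for each row of the file.
sizegroup <- rep(c(1, 2, 3), 16)

# Step 3: make the labels categorical so that aov treats them as groups.
sizegroup <- factor(sizegroup)

# One-way ANOVA table.
fit <- aov(amountofmoney ~ sizegroup)
print(summary(fit))

# Sample variance within each of the three groups at once.
groupvars <- tapply(amountofmoney, sizegroup, var)
print(groupvars)
```

Using tapply gives all three group variances in one call; each entry agrees with var(amountofmoney[sizegroup==k]) for the corresponding group k.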
Assignment 3 in pdf
Note: Due date is Tuesday, March 22.
R workspace containing data for Additional Problem 1
R workspace containing data for Additional Problem 9
R workspace containing data for Additional Problem 10
An important piece of missing information for Additional Problem
5: It is thought that the standard deviation of a single observation
is approximately 10 under both conditions. Also, use a normal
distribution approximation.
Assignment 2 in pdf
Assignment 1 in pdf
Term test from Winter 2004 in pdf
Solutions to term test from Winter 2004 in pdf
Exam from Winter 2004 in pdf
Solutions to exam from Winter 2004 in pdf
E-mail course instructor: alison.gibbs@utstat.utoronto.ca