This course will look at how statistical computations are done, and how to write programs for statistical problems that aren't handled by standard packages. Students will program in the S language, which will be introduced at the start of the course. Topics will include the use of simulation to investigate the properties of statistical methods, matrix computations used to implement linear models, optimization methods used for maximum likelihood estimation, and numerical integration methods for Bayesian inference. The course will conclude with a look at some more specialized statistical algorithms, such as the EM algorithm for handling missing data and latent variables, and Markov chain Monte Carlo methods for Bayesian inference.
Instructor: Radford Neal, Phone: (416) 978-4970
Office Hours: Wednesdays 2:10-3:00 and Fridays 3:30-4:00, in SS 6016A
Lectures: Tuesdays, Thursdays, and Fridays from 1:10pm to 2:00pm, in SS 2106, from September 12 to December 8.
Book: R. A. Thisted, Elements of Statistical Computing.
Assignments will be done in S-Plus or in R (the free S-Plus lookalike, downloadable from http://lib.stat.cmu.edu/R/CRAN/). Graduate students will use the utstat computer system. Undergraduates will use CQUEST.
Four assignments, each worth 16%
Three in-class tests, each worth 12%
Assignment 1: Handout in PostScript. Solution: S/R program, and its output.
Assignment 2: Handout in PostScript, plus the data file. Solution: S/R program, and its output.
Assignment 3: Handout in PostScript. Solution: S/R program, and its output.
Assignment 4: Handout in PostScript, plus data set A and data set B. Solution: S/R program, and its output.
Test 1: Now scheduled for October 19. Here are the topics and study questions for test 1.
Test 2: Now scheduled for November 17. Here are the topics for test 2.
Test 3: Scheduled for December 8.
Permutation test for correlation, and the output
Least-squares regression with Householder transformations, plus a test program and its output
Maximum likelihood for no-zero Poisson data
Maximum likelihood for a simple model with two variances
Last year's maximum likelihood assignment: Handout, Solution to part 1, Solution to part 2
Midpoint rule used for Bayesian inference for two Poisson models, and the output
Last year's Bayesian integration assignment: Handout, Solution, Output.
Evaluating a double integral with the midpoint rule, and the output.
EM algorithm for censored Poisson data, and output of two tests.
Gibbs sampling example: S program, some plots of output, for data of 3.9, 3.6, 3.7.
Metropolis algorithm example: S program. Results of a run on data of 3.9, 3.6, 3.7 with proposal standard deviations of 0.1: scatterplot, simulation trace. Results of a run on the same data with proposal standard deviations of 1: scatterplot, simulation trace.
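As a rough illustration of the kind of run described in the Metropolis example, here is a minimal sketch in Python rather than S. It is not the course's program. The model is an assumption made only for illustration: the three data points are treated as draws from a normal distribution with unknown mean `mu` and standard deviation `sigma`, with priors `mu ~ Normal(0, 10^2)` and `log(sigma) ~ Normal(0, 2^2)`. The function names, starting point, seed, and burn-in length are likewise hypothetical.

```python
import math
import random

DATA = [3.9, 3.6, 3.7]  # the three observations quoted in the text


def log_posterior(mu, log_sigma, data=DATA):
    """Unnormalized log posterior for an ASSUMED model (not from the course page):
    y_i ~ Normal(mu, sigma^2), with priors mu ~ Normal(0, 10^2) and
    log(sigma) ~ Normal(0, 2^2)."""
    sigma = math.exp(log_sigma)
    log_lik = sum(-log_sigma - 0.5 * ((y - mu) / sigma) ** 2 for y in data)
    log_prior = -0.5 * (mu / 10.0) ** 2 - 0.5 * (log_sigma / 2.0) ** 2
    return log_lik + log_prior


def metropolis(n_iter, proposal_sd, start=(0.0, 0.0), seed=1):
    """Random-walk Metropolis updating (mu, log_sigma) jointly.
    Returns the list of visited states and the acceptance rate."""
    rng = random.Random(seed)
    mu, log_sigma = start
    current_lp = log_posterior(mu, log_sigma)
    trace, accepted = [], 0
    for _ in range(n_iter):
        # Symmetric Gaussian proposal centred on the current state.
        new_mu = mu + rng.gauss(0.0, proposal_sd)
        new_ls = log_sigma + rng.gauss(0.0, proposal_sd)
        new_lp = log_posterior(new_mu, new_ls)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < new_lp - current_lp:
            mu, log_sigma, current_lp = new_mu, new_ls, new_lp
            accepted += 1
        trace.append((mu, log_sigma))
    return trace, accepted / n_iter


if __name__ == "__main__":
    for sd in (0.1, 1.0):
        trace, rate = metropolis(20000, sd)
        kept = trace[2000:]  # discard an arbitrary burn-in period
        mean_mu = sum(m for m, _ in kept) / len(kept)
        print(f"proposal sd {sd}: acceptance rate {rate:.2f}, mean of mu {mean_mu:.2f}")
```

Under these assumptions, the smaller proposal standard deviation should give a higher acceptance rate but smaller moves per iteration, while the larger one is rejected more often but takes bigger steps when accepted. A scatterplot of the states and a simulation trace are the usual ways to see this trade-off.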