Statistics 203: Introduction to Regression Models and ANOVA

  Winter 2010    ●   Tu Th      9:30-10:45 PM   ●   Cummings Art Building Room 2

Nancy R. Zhang   ●   nzhang at stanford   ●   Office Hours: Tu, Thur 2:15-3:15, Sequoia Hall 141


A N N O U N C E M E N T S

3/11: Here is the final exam.  The data sets are here.

3/4:  To convert a binomial table to a 0-1 Bernoulli table, try this script (it uses the NFL data).

2/26: Note the change in due date for HW 4.

2/19: Clarifications for HW3:  Problem 5 (RABE 11.7): (a) Start with any model with 19 predictors.  (d) Use data up to 1992 to fit your model, and predict the year 1996.

2/18: Problem set 2 has been graded and placed in the 203 box on the second floor of Sequoia Hall.

1/26: The midterm is next Thursday, Feb 4.  It will be in-class, open book and open note.  During office hours next Tuesday (2:15-3:15), I will hold a review session in the Girshick Library, downstairs in Sequoia Hall.

1/19: Lecture was cancelled due to power outages on campus.  The lecture slides are posted.  We will pick up on Thursday.  Due date for HW1 is extended to next Tuesday, 1/26.

This year’s R introductory session slides by Pei He.

The syllabus has been updated with textbook references.

Note the change in classroom to Cummings Art Building Room 2.

Note the change in Nancy’s office hours to Tu Thur 2:15-3:15, and the updated TA office hours.

An R introductory session will be held on Friday, 1/15, 4:00-5:15 PM in the Sequoia Hall Computer Lab (Room 211).

        R introductory session slides by Yueh Wen Liao.

C O U R S E    D E S C R I P T I O N

This course introduces statistical regression models and ANOVA. 

We will cover the basic concepts behind these models, and apply them to the analysis of data sets.    Please see the syllabus for more information.

P R E R E Q U I S I T E S

Basic probability and statistics at the level of Stat 200 and Stat 116.   Basic linear algebra.

T A

Yunting Sun (yunting.sun  at  gmail) Office hours: 12:30-1:30 PM Friday, Sequoia Hall 244

Pei He (hepei at stanford)  Office hours: 11:00 AM-12:00 PM Thursday, Sequoia Hall 244

 

T E X T B O O K S

RABE: Chatterjee and Hadi, Regression Analysis by Example, 4th Edition (Required)

KNNL: Kutner et al., Applied Linear Statistical Models, 5th Edition (Reference)

Weisberg, Applied Linear Regression, 2nd Edition (Reference)

D A T A

Data sets used in this class are here.

T E N T A T I V E    S Y L L A B U S (Materials will be posted here after every lecture.)

I follow the book very loosely.  The course slides will be your best reference.  Some lectures cover material not in RABE.  For example, the lectures on fixed and random effects come mostly from KNNL, and those on model selection incorporate recent developments not in either textbook.  The section numbers from the books are listed for reference.  This schedule is tentative and may be adjusted to students’ needs during the quarter.

Date      Materials
Tu 1/5    Review.  Slides, R examples.
Th 1/7    Simple linear regression (RABE 2).  Slides, R examples.
Tu 1/12   Inference, diagnostics for linear regression (RABE 4.1-4.11).
Th 1/14   Inference, diagnostics for linear regression (RABE 4.1-4.11).  Slides, R examples.
Tu 1/19   Multiple regression, constraints, predictions (RABE 3).  Slides, R examples.
Th 1/21   (Slides and R examples are from Tuesday’s lecture.)
Tu 1/26   Multiple diagnostics, ANOVA (RABE 4.11-4.13, 5).  Slides, R examples.
Th 1/28   Fixed and random effects (KNNL 25).  Slides, R examples.
Tu 2/2    Random and mixed effects (KNNL 25).  Slides, R examples.
Th 2/4    Midterm.
Tu 2/9    Weighted least squares, variable transformations, PCA (RABE 7, 9.4-9.5).  Slides, R examples.
Th 2/11   PCA.  Slides, R examples.
Tu 2/16   Model selection: step-wise procedures (RABE 11).  Slides, R examples.
Th 2/18   Model selection: ridge, LASSO, and LARS (RABE 11).  Slides, R examples, Hesterberg et al. review.
Tu 2/23   Logistic regression (RABE 12).  Slides continue from the last lecture; R examples.
Th 2/25   Logistic regression (RABE 12).  Slides, R examples.
Tu 3/2    Contingency tables (KNNL 14.13).  Slides, R examples.
Th 3/4    Contingency tables (KNNL 14.13).  Slides, R examples.
Tu 3/9    Regression with correlated errors (time series) (RABE 8, 9).  Slides, R examples.

A S S I G N M E N T S

Assignments must be handed in at the beginning of lecture on the due date.  Solutions are posted the following day.  You may hand in at most one problem set late, by at most one day, with a 10% deduction on the grade.

Due date                 File            Solutions
Tues, 1/26               Problem set 1   Problem set 1 solutions
Thurs, 2/4               Problem set 2   Problem set 2 solutions
Tues, 2/23 (Tentative)   Problem set 3   Problem set 3 solutions
Tues, 3/9 (Tentative)    Problem set 4   Problem set 4 solutions

R

We will be using R for most of the data analysis in this class.  R can be freely downloaded here.

G R A D I N G

Homeworks                       40%
Midterm (in class)              20%
Final problem set (take home)   40%