Statistics 166/366

Statistical Models in Biology

Spring 2010   ●   Mon, Wed 11:00AM - 12:15PM       McCullough 122

Professor Nancy Zhang


Announcements

Final project presentations will be Friday, June 4, 3-6 pm.  Click here for the schedule of presentations.  The location is Sequoia Hall 200.

Contacts

Role | Name | Office / Office hours
Instructor | Nancy Zhang (nzhang) | Sequoia 141, 1:30-2:30 PM Mondays
TA | Hao Chen (haochen) | Sequoia 231, 4-5 PM Wednesdays

Pre-requisite

Basic probability and statistics at the level of Stats 116 and Stats 200.

Tentative Syllabus  (Links will work after the lecture)

Week | Lecture | Date | Topic | Reading | Slides / code
1 | 1 | 29-Mar | Introduction, course logistics | Stat & Prob primer [by Woolfe et al.] | Slides, notes
1 | 2 | 31-Mar | EM | Dempster et al., JRSSB, 1977. | Slides, R code
2 | 3 | 5-Apr | Estimating isoform expression. Guest lecturer: Hui Jiang, Stanford Univ. | Jiang and Wong, Bioinformatics, 2009. | Slides
2 | 4 | 7-Apr | Hidden Markov Models | HMM tutorial by Rabiner | Slides
3 | 6 | 14-Apr | HMM example I: DNA copy number estimation | Fridlyand et al., JMV, 2004; Lai, Xing and Zhang, Biostatistics, 2007. | Slides
3 | 5 | 12-Apr | HMM example II: fastPHASE | Scheet & Stephens, AJHG, 2006. | Slides
4 | 7 | 19-Apr | Monte Carlo integration, rejection method |  | Slides, R code
4 | 8 | 21-Apr | Rejection method example: coalescent | Tavare et al., Genetics, 1997. | Slides
5 | 9 | 26-Apr | Metropolis-Hastings algorithm | Chib & Greenberg, American Statistician, 1995. | Slides
5 | 10 | 28-Apr | Gibbs sampling | Casella & George, American Statistician, 1992. | Slides, R code
6 | 11 | 3-May | Metropolis-Hastings example | Pritchard et al., Genetics, 2000. | Slides, movie
6 | 12 | 5-May | Gibbs sampling example | Liu et al., JASA, 1995. | Slides, notes
7 | 13 | 10-May | Network models in biology. Guest lecturer: Jie Peng, UC Davis | Peng et al., AOAS, 2009. | Slides, notes
7 | 14 | 12-May | Network models in biology. Guest lecturer: Haiyan Huang, UC Berkeley | Huang et al., PNAS, 2010. | Slides
8 | 15 | 17-May | Bootstrap | Efron & Tibshirani, Stat. Sci., 1986. | Slides, R code, mammal.txt
8 | 16 | 19-May | Bootstrap example: phylogenetic analysis | Felsenstein, Evolution, 1985. | Slides
9 | 17 | 24-May | Scan statistics for genome-wide profiling | Zhang, Frontiers in Computational and Systems Biology, 2010. | Slides
9 | 18 | 26-May | Multi-sample scan statistics and data integration | Zhang et al., Bioinformatics, 2010. | Slides
10 | 19 | 31-May | Project presentations |  |
10 | 20 | 2-Jun | Project presentations |  |

 

Textbook

There are no required texts for this class. Reading materials to complement the lectures will be posted here or distributed in class. Below is a partial list of books that cover some of the topics at a more advanced level.

·  Statistical Analysis with Missing Data by Little & Rubin

·  Computational Statistics by Givens & Hoeting

·  An Introduction to the Bootstrap by Efron & Tibshirani

·  Monte Carlo Strategies in Scientific Computing by Jun Liu

Course requirements

There will be three assignments and one final project.  All will require some programming in R.

Homework 1. (Due April 26)    Solutions

Homework 2. (Due May 10)     

Homework 3. (Due May 26)   X.txt, XY.txt

Final Project  Guidelines (Due June 4, 3 pm)

 

Grading

Homework: 60%

Final Project: 40%