Stat 141     11/2/06
One way anova and post-hoc multiple pairwise comparisons
recap: 2-group sample size calculations, width of CI (ovr)
------------------------------------
One-way classifications: 2 or more groups

3-levels of the classification variable, 12 replications (months) per cell

Mental Hospital Admissions During Full Moons
admission rates to the emergency room of a Virginia mental health clinic
before, during and after the 12 full moons from August 1971 to July 1972.

Variable    Description
Month       Month of year: Aug, Sep, ... Jul
Moon        Before, During or After the full moon
Admission   Admission rate (patients/day)

> mental = read.table(file="D:\\drr06\\stat141\\Lect12\\fullmoon.txt", header = T)
> attach(mental)
> summary(mental)
     Month        Moon      Admission
 Apr    : 3   After :12   Min.   : 5.000
 Aug    : 3   Before:12   1st Qu.: 8.475
 Dec    : 3   During:12   Median :12.850
 Feb    : 3               Mean   :11.931
 Jan    : 3               3rd Qu.:14.000
 Jul    : 3               Max.   :25.000
 (Other):18

(see plots from web page) http://www.statsci.org/data/general/fullmoon.html

> tapply(Admission, Moon, summary)
$After
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  5.800   8.875  12.850  11.460  13.350  15.800

$Before
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   6.40    7.85   10.95   10.92   14.20   15.80

$During
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   5.00   11.25   13.50   13.42   14.50   25.00

> mentalaov = aov(Admission ~ Moon)
> summary(mentalaov)
            Df Sum Sq Mean Sq F value Pr(>F)
Moon         2  41.51   20.76  1.1741 0.3217
Residuals   33 583.40   17.68

# moral here--ignoring month (treating as reps) causes within cell variance to swamp
# moon factor (see ozDASL display). Need to get months out of the error term (next week).
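The moral above can be seen on simulated data. This is only a sketch: the numbers below are generated, NOT the clinic's admissions, and the seed, effect sizes and the names sim, month_effect and moon_effect are assumptions made for illustration. It fits the one-way model and then a model with month taken out of the error term, and prints the residual mean square and the F for moon from each.

```r
# Simulated illustration only: these are NOT the clinic's admission data.
set.seed(141)                                    # arbitrary seed, for reproducibility

month <- factor(rep(month.abb, each = 3))        # 12 months, 3 moon phases each
moon  <- factor(rep(c("Before", "During", "After"), times = 12))

month_effect <- rnorm(12, mean = 0, sd = 4)      # assumed: large month-to-month shifts
moon_effect  <- c(Before = 0, During = 2, After = 0)   # assumed: modest moon effect

y <- 12 +
  month_effect[as.integer(month)] +              # one shift per month
  unname(moon_effect[as.character(moon)]) +      # shift for the moon phase
  rnorm(36, mean = 0, sd = 1)                    # small within-cell noise

sim <- data.frame(y, month, moon)

# One-way fit: month variation is left in the residual (treated as replication)
oneway  <- anova(aov(y ~ moon, data = sim))
# Month included: month variation is removed from the error term
blocked <- anova(aov(y ~ month + moon, data = sim))

print(c(resid_MS_oneway  = oneway["Residuals", "Mean Sq"],
        resid_MS_blocked = blocked["Residuals", "Mean Sq"]))
print(c(F_moon_oneway  = oneway["moon", "F value"],
        F_moon_blocked = blocked["moon", "F value"]))
```

Because the simulated design is balanced, the Moon mean square is the same in both fits; only the denominator of F changes.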
-------------------------------------------------------------------------------
> # Verzani ex, calories per day, 3 diff months (I = 3, n = 5), p.316; intro stack etc
> may = c(2166, 1568, 2233, 1882, 2019)
> sep = c(2279, 2075, 2131, 2009, 1793)
> dec = c(2226, 2154, 2583, 2010, 2190)
> calsperday = stack(list(maycal = may, sepcal = sep, deccal = dec))
> names(calsperday)
[1] "values" "ind"
> attach(calsperday)
> tapply(values, ind, mean)
deccal maycal sepcal
2232.6 1973.6 2057.4
> tapply(values, ind, sd)
  deccal   maycal   sepcal
212.3483 264.2296 178.2437
> calaov = aov(values ~ ind)
> summary(calaov)
            Df Sum Sq Mean Sq F value Pr(>F)
ind          2 174664   87332  1.7862 0.2094
Residuals   12 586720   48893

# again cannot reject omnibus Ho

> # Chap 11 lamb ex corn weights no difference; do spine extension p.483, prob 11.13
> # data from Oberlin college, year 2000 chap 11 notes
> dance = read.table(file="D:\\stat141\\spine.dat", header = T)
> attach(dance)
> tapply(SpineEx, Group, mean)
  aerobics    control     modern
-0.1750000  0.1388889  0.9750000
> tapply(SpineEx, Group, sd)
 aerobics   control    modern
0.7997395 0.5743354 0.8616038
> danceaov = aov(SpineEx ~ Group)
> summary(danceaov)
            Df  Sum Sq Mean Sq F value   Pr(>F)
Group        2  7.0357  3.5178  6.0667 0.006882 **
Residuals   26 15.0764  0.5799
> qf(.95,2,26)   # checks with SW table 10, p.688
[1] 3.369016
> TukeyHSD(danceaov)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = SpineEx ~ Group)

$Group
                      diff         lwr      upr
control-aerobics 0.3138889 -0.55552260 1.183300
modern-aerobics  1.1500000  0.30377700 1.996223
modern-control   0.8361111 -0.03330037 1.705523

> oneway.test(SpineEx ~ Group)

        One-way analysis of means (not assuming equal variances)

data:  SpineEx and Group
F = 4.9288, num df = 2.000, denom df = 17.126, p-value = 0.02038

------------------------------------------------------------------------------
You have already been doing one-way anova; revisit t-test examples

> plant = read.table(file="D:\\stat141\\ancy.dat", header = T)
> attach(plant)
> oneway.test(height~group)

        One-way analysis of means (not assuming equal variances)

data:  height and group
F = 3.9755, num df = 1.000, denom df = 12.783, p-value = 0.06795

> plantaov = aov(height~group)
> summary(plantaov)
            Df  Sum Sq Mean Sq F value  Pr(>F)
group        1  89.572  89.572  3.9677 0.06781 .
Residuals   13 293.477  22.575
---
> # p-values and squares of the t-statistics from 10/26 match:
> # Welch t (oneway.test) and pooled t (aov)
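The last comment can be checked directly. A minimal sketch, assuming made-up two-group data (the height values and the group labels "control" and "treated" below are hypothetical, not the contents of ancy.dat): with two groups, the pooled t-statistic squared should equal the aov F, and the Welch t-statistic squared should equal the oneway.test F, with matching p-values and denominator df.

```r
# Hypothetical two-group data, made up for illustration (NOT the ancy.dat heights).
height <- c(10.2, 14.1,  9.8, 12.5, 11.0, 13.3,  8.9,
            15.4, 12.8, 17.9, 13.6, 16.2, 11.7, 19.0, 14.5)
group  <- factor(rep(c("control", "treated"), times = c(7, 8)))

# Two-sample t-tests: pooled (equal variances) and Welch (unequal variances)
pooled <- t.test(height ~ group, var.equal = TRUE)
welch  <- t.test(height ~ group)

# The same comparisons written as one-way anova
aov_tab <- anova(aov(height ~ group))      # pooled-variance F test
welch_f <- oneway.test(height ~ group)     # Welch F test (default: var.equal = FALSE)

# Pooled t squared vs aov F, and their p-values
print(c(t_squared = unname(pooled$statistic)^2,
        F_aov     = aov_tab["group", "F value"]))
print(c(p_t   = pooled$p.value,
        p_aov = aov_tab["group", "Pr(>F)"]))

# Welch t squared vs oneway.test F, their df and p-values
print(c(t_squared = unname(welch$statistic)^2,
        F_welch   = unname(welch_f$statistic)))
print(c(df_t = unname(welch$parameter),
        df_F = unname(welch_f$parameter[2])))
print(c(p_t = welch$p.value, p_F = welch_f$p.value))
```

The agreement holds only for two groups; with three or more groups there is no single t-statistic to square.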