Education 257 SOLUTIONS HW1 1/23/05 1. One-way classification NWK Chap 16, prob 16.12 (Ver4) ; Chap 16 prob 16.9 (Ver5) a) step 1 read in data from knee.dat (here done in Minitab) MTB > read '/usr/class/ed257/HW/knee.dat' c1 c2 24 ROWS READ ROW C1 C2 1 29 1 2 42 1 3 38 1 4 40 1 . . . and then get descriptive statistics by group (1,2,3) MTB > describe c1; SUBC> by c2. C2 N MEAN MEDIAN TRMEAN STDEV SEMEAN C1 1 8 38.00 40.00 38.00 5.48 1.94 2 10 32.00 31.00 31.62 3.46 1.10 3 6 24.00 22.50 24.00 4.43 1.81 C2 MIN MAX Q1 Q3 C1 1 29.00 43.00 32.00 42.00 2 28.00 39.00 29.00 35.00 3 20.00 32.00 20.75 27.50 The group means are 38, 32, and 24 for the below average, average, and above average groups, respectively. Variances are 30.03, 11.97, and 19.62. (Note: The group means and SDs are also displayed under the ANOVA table) b) here are dotplots below, boxplots or stem-and-leafs for the 3 groups would be alternatives. Main point is always look at the data MTB > dotplot c1; SUBC> by c2. C2 1 (below average) . . . : : . -----+---------+---------+---------+---------+---------+-C1 C2 2 (average) . : . : . : . -----+---------+---------+---------+---------+---------+-C1 C2 3 (above average) . . . . . . -----+---------+---------+---------+---------+---------+-C1 20.0 25.0 30.0 35.0 40.0 45.0 These plots illustrate the clustering of the observations in each group about the group means. The small sample sizes make it difficult to detect outliers or heteroskedasticity (unequal group variances), although the observations in the below average group appear to be somewhat more spread out than are those in the other groups. c) run a one-way anova to get the anova table (you could do it by hand from the descriptive statistics in part a) here is Minitab output MTB > oneway c1 c2 resids in c3 fits in c4; SUBC> tukey. (Note: the above command tells Minitab to store the residuals in C3 and the fitted values (which are just the group means) in C4. The words "resids in" and "fits in" are unnecessary; could just write MTB >oneway c1 c2 c3 c4) The tukey subcommand above does the calculations for part e ANALYSIS OF VARIANCE ON C1 SOURCE DF SS MS F p C2 2 672.0 336.0 16.96 0.000 ERROR 21 416.0 19.8 TOTAL 23 1088.0 INDIVIDUAL 95 PCT CI'S FOR MEAN BASED ON POOLED STDEV LEVEL N MEAN STDEV -------+---------+---------+--------- 1 8 38.000 5.477 (----*-----) 2 10 32.000 3.464 (----*----) 3 6 24.000 4.427 (-----*-----) -------+---------+---------+--------- POOLED STDEV = 4.451 24.0 30.0 36.0 The omnibus null hypothesis is Ho: mu(1)=mu(2)=mu(3) We test this against the alternative Ha: not all mu(i) are equal Test statistic is MSB/MSW = 336/19.8 = 16.96. Find critical value F(.95,2,21): MTB > invcdf .95; SUBC> f 2 21. 0.9500 3.4668 (or use NWK table B.4 (ver4 or ver5) Since 16.96 > 3.4668, we reject the omnibus null hypothesis and conclude that there are differences among the three groups. d) Resids are stored in C3 & fits in C4, from oneway command above. MTB > plot c3 c4 - * - * 6.0+ - * C3 - 2 - * 2 2 - * 0.0+ * - * 2 - * * - 2 3 - -6.0+ - - * - * - ----+---------+---------+---------+---------+---------+--C4 25.0 27.5 30.0 32.5 35.0 37.5 We could also plot C3 against C2, or produce aligned dotplots of the residuals for each group. Here's one code example to obtain residuals the long way (remember residuals are just the differences between each observation and the group mean): MTB > unstack c1 c3-c5; SUBC> subscripts c2. MTB > let c6=c3-mean(c3) MTB > let c7=c4-mean(c4) MTB > let c8=c5-mean(c5) MTB > stack c6-c8 c9 MTB > plot c9 c2. The plots suggest that the variability of the observations in the below average group is greater than that for the other groups (the dotplots and a quick look at the descriptive statistics support this). Since the sample sizes are a bit unequal, if one wanted to be very careful, the best analysis here would be to use something like BMDP7D which we illustrated with the IBS data to use a one-way anova method that did not require the equal variance assumption. e. continuing the problem from tukey subcommand in the main analysis MTB > oneway c1 c2 resids in c3 fits in c4; SUBC> tukey. ------------- TUKEY'S multiple comparison procedure Nominal level = 0.0500 Family error rate = 0.0500 Individual error rate = 0.0199 Critical value = 3.56 Intervals for (mean of column group) - (mean of row group) 1 2 2 0.680 11.320 3 7.943 2.208 20.057 13.792 [check by hand from NWK section 17.5 (ver4 or ver5) Just presenting the output above, is not really a complete answer. Much better to say--the confidence interval for mu(1) - mu(2) has endpoints (.68, 11.32) etc etc None of these intervals includes zero, so we conclude that each group means differ from one another. part (f) Knee problem, design considerations Note: remember, for Tukey we don't need to worry splitting up the familywise error rate width = 5 = 2*q(.95,I,dfw)*sqrt(MSW/n) = 2*q(.95,3,dfw)*sqrt(19.8/n) using MSW from previous analysis a little algebra... 2.5 = q(.95,3,dfw)*sqrt(19.8/n) 6.25 = [q(.95,3,dfw)]^2 * 19.8/n n = 3.168 * [q(.95,3,dfw)]^2 Problem is, we don't know exactly what q is without knowing n, because q depends on degrees of freedom within. So we should use any prior information we have to suggest a best guess. The widths of the intervals in our previous analysis were around 11. Since we want to cut this width approximately in half, we'll need to quadruple (approximately) our group sample sizes. So start with n=40 as a best guess, which gives dfw =120-3 =117. Using NWK (ver 4 or 5) Table B.9, q(.95,3,117) = 3.36 (approx.). Therefore n = 3.168 * (3.36)^2 = 36, which is pretty close to our original guess. Anything reasonably close to this number is acceptable. ----------------------------------------------------------- 2. Factorial (2way) anova First order of business is to set up a data file. typical structure is outcomes in c1, row factor c2, column factor c3 a. For profile plot get cell means: MTB > table c2 c3; SUBC> stats c1. ROWS: C2 COLUMNS: C3 1 2 3 ALL 1 6 6 6 18 23.167 28.333 12.833 21.444 3.971 4.320 4.792 7.801 2 6 6 6 18 20.500 25.000 32.833 26.111 4.135 2.898 3.430 6.201 ALL 12 12 12 36 21.833 26.667 22.833 23.778 4.108 3.916 11.175 7.337 CELL CONTENTS -- C1:N MEAN STD DEV Profile plot-- see figures 19.1-19.5 ver4 or Figs 19.3-19.7 ver5 I won't try to draw here. Main effects of both Time of Isolation and Level of Reinforcement are best interpreted keeping in mind the disordinal interaction indicated in the profile plot. The profile plot indicates that recall increases steadily with time in isolation only for verbally reinforced children. For unreinforced children, recall increases from 20 to 40 minutes of isolation, but decreases from the 40 minute level (and falls below the 20 minute level) for 60 minutes of isolation. b. The model for these data is: y(ijk) = mu + alpha(i) + beta(j) + alphabeta (ij) + epsilon(ijk) where mu is the grand mean of all observations. y(ijk) is recall for child k observed within the group defined by reinforcement level i and isolation level j. alpha(i) is the effect of reinforcement at level i. beta(j) is the effect of isolation at level j. alphabeta (ij) is the effect of the interaction between reinforcement at level i and isolation at level j . epsilon(ijk) is a random error part c. The ANOVA table below indicates significant main effects and interaction between the two factors (Time of Isolation and Level of Reinforcement) using overall error rate .05. Factor Type Levels Values C2 fixed 2 1 2 C3 fixed 3 1 2 3 Analysis of Variance for C1 Source DF SS MS F P C2 1 196.00 196.00 12.42 0.001 C3 2 156.22 78.11 4.95 0.014 C2*C3 2 1058.67 529.33 33.55 0.000 Error 30 473.33 15.78 Total 35 1884.22 To get the critical values for the series of 3 hypothesis tests, we use inverse cdf function in computer packages (here minitab) MTB > invcdf .983; SUBC> f 2 30. 0.9830 4.6817 MTB > invcdf .983; SUBC> f 1 30. 0.9830 6.3871 This gives us each test at .017 (c.f. alphatot.tab) We are able to reject the main effects and interaction null hypotheses. ------------------------------------------------------------------ SAS input and output title 1 'SAS anova HW1 problem 2'; data recall; input outcome reinforce isolation; datalines; 26 1 1 23 1 1 28 1 1 19 1 1 18 1 1 25 1 1 30 1 2 25 1 2 27 1 2 36 1 2 28 1 2 24 1 2 6 1 3 11 1 3 17 1 3 10 1 3 14 1 3 19 1 3 15 2 1 24 2 1 25 2 1 16 2 1 22 2 1 21 2 1 24 2 2 29 2 2 23 2 2 26 2 2 27 2 2 21 2 2 31 2 3 29 2 3 35 2 3 38 2 3 34 2 3 30 2 3 ; proc anova data=recall; class reinforce isolation; model outcome = reinforce|isolation; run; -----output---------- 1 SAS anova HW1 problem 2 1 15:13 Sunday, January 23, 2005 The ANOVA Procedure Class Level Information Class Levels Values reinforce 2 1 2 isolation 3 1 2 3 Number of Observations Read 36 Number of Observations Used 36 1 SAS anova HW1 problem 2 2 15:13 Sunday, January 23, 2005 The ANOVA Procedure Dependent Variable: outcome Sum of Source DF Squares Mean Square F Value Pr > F Model 5 1410.888889 282.177778 17.88 <.0001 Error 30 473.333333 15.777778 Corrected Total 35 1884.222222 R-Square Coeff Var Root MSE outcome Mean 0.748791 16.70520 3.972125 23.77778 Source DF Anova SS Mean Square F Value Pr > F reinforce 1 196.000000 196.000000 12.42 0.0014 isolation 2 156.222222 78.111111 4.95 0.0139 reinforce*isolation 2 1058.666667 529.333333 33.55 <.0001 -------------------------------------------------------------- 3. Mixed Model NWK Chap 24 (ver4) Chap 25 (ver5) Model is: y(ijk) = mu + alpha(i) + beta(j) + alphabeta(ij) + epsilon(ijk) where mu is the grand mean of all evaluations. y(ijk) is evaluation k observed within the group defined by level i of the row factor and level j of the column factor. alpha(i) is a random effect of patient at level i. beta(j) is the fixed effect of physician at level j. alphabeta(ij) is a random effect of the interaction between patient and physician at level i of patient and level j of physician. epsilon(ijk) is a random component of error for observation k at level i of patient and level j of physician. -------------------------------------- b. ************So here's the complete data structure ROW eval MD patient 1 7.2 1 1 2 9.6 1 1 3 8.5 2 1 4 9.6 2 1 5 9.1 3 1 6 8.6 3 1 7 8.2 4 1 8 9.0 4 1 9 7.8 5 1 10 8.0 5 1 11 4.2 1 2 12 3.5 1 2 13 2.9 2 2 14 3.3 2 2 15 1.8 3 2 16 2.4 3 2 17 3.6 4 2 18 4.4 4 2 19 3.7 5 2 20 3.9 5 2 21 9.5 1 3 22 9.3 1 3 23 8.8 2 3 24 9.2 2 3 25 7.6 3 3 26 7.1 3 3 27 7.3 4 3 28 7.0 4 3 29 9.2 5 3 30 8.3 5 3 31 5.4 1 4 32 3.9 1 4 33 6.3 2 4 34 6.0 2 4 35 6.1 3 4 36 5.6 3 4 37 5.0 4 4 38 5.4 4 4 39 6.5 5 4 40 6.9 5 4 So now the mixed model anova (here minitab) MTB > anova eval = MD|patient; SUBC> random patient; SUBC> restricted; SUBC> ems. Factor Type Levels Values MD fixed 5 1 2 3 4 5 patient random 4 1 2 3 4 Analysis of Variance for eval Source DF SS MS F P MD 4 3.812 0.953 0.71 0.602 patient 3 180.133 60.044 173.41 0.000 MD*patient 12 16.159 1.347 3.89 0.004 Error 20 6.925 0.346 Total 39 207.028 Source Variance Error Expected Mean Square component term (using restricted model) 1 MD 3 (4) + 2(3) + 8Q[1] 2 patient 5.9698 4 (4) + 10(2) 3 MD*patient 0.5001 4 (4) + 2(3) 4 Error 0.3463 (4) you should check always that the computer package produced the correct test statistics from the anova table (i.e. choosing the denominators according the the indications from the E(MS) table and also implicitly that the p-values reflect using the correct df from the test statistics ). Clearly in these data there is no effect for MD (which could indicate that all doctors are equally good?), there is a large patient effect and also a significant interaction (we should look at the profile plot etc if we were doing more than just using this for practice in setting up the mixed-model analysis.) c. Post-hoc comparisons Now we could the physician means from a table of all 20 cell means MEANS MD N eval 1 8 6.5750 2 8 6.8250 3 8 6.0375 4 8 6.2375 5 8 6.7875 Tukey pairwise comparisons for the physician factor (using number of levels of physician = 5, df(error) use the error term MSinteraction and df(physician*patient) = 12, and thus q = 4.52 at Type 1 error rate = .05, and number of observations per level of physician = 8 -- 2 replications times 4 levels of patient). W = (4.52)*Sqrt[1.347/8] = 1.85. So we can see that all confidence intervals include 0 (so none of the 10 pairwise differences are significantly different from 0). We would expect this result given the anova table above. In real life you wouldn't bother most likely with a post-hoc pairwise comparison for a main effect this seemingly nonexistent. But for the exercise it's worth it..... -------------------------------------------------------- Another look at part (b) "for old times sake" if the computing package doesn't automatically produce test statistics in the mixed-model specification Analysis of Variance for eval Source DF SS MS patient 3 180.133 60.044 MD 4 3.812 0.953 patient*MD 12 16.159 1.347 Error 20 6.925 0.346 Total 39 207.028 Tests for interaction and main effects; (t.s. = test statistic). reference distrib interaction t.s. 1.347/.346 = 3.89 (F 12,20) patient t.s. 60.04/.346 =173.5 (F 3,20) MD t.s. .953/1.347 = .707 (F 4,12) ------------------------------------------------------- ================================================================ 4. Unbalanced designs NWK Ch 22 (ver4) Ch 23 (ver5, prob 23.9) Note that this is a 2 way (4 X 3) anova design, with both factors (subject of course, highest degree earned) fixed and where cell n's are unequal. Part a. A profile plot of the cell means could be drawn with either factor A or factor B on the x-axis, as below (connect the like symbols between columns). Comments about the plots: There appears to be a main effect for subject of course, with earnings per course lowest for Humanities and highest for Engineering courses, with Social Sciences and Management courses at similar intermediate earning levels. This is true regardless of the highest degree earned by the instructor, suggesting no important interaction between the factors. There also appears to be a main effect for highest degree earned, with instructors who have a PhD earning more than those with either BA or MA degrees. Again this is true regardless of subject of the course. No important interaction is visible. 3.80 | | *PhD E 3.60 | A | *PhD R 3.40 | *PhD N | I 3.20 | N | G 3.00 | S | *MA 2.80 | *BA P | *PhD E 2.60 | *BA*MA R | *MA*BA 2.40 | C | O 2.20 | U | R 2.00 | *MA S | E 1.80 | *BA | 1.60 | ------------------------------------------------------------- 1 2 3 4 Humanities Engineering Social Sciences Management Factor A: Subject of course ------------------------------------- 3.80 | | *ENG E 3.60 | A | *SS R 3.40 | *MGT N | I 3.20 | N | G 3.00 | S | *ENG 2.80 | *ENG P | *HUM E 2.60 | *MGT *MGT R | *SS *SS 2.40 | C | O 2.20 | U | R 2.00 | *HUM S | E 1.80 | *HUM | 1.60 | ------------------------------------------------------------- 1 2 3 BA MA PhD Factor B: Highest degree earned Part b. To test H(o): all alpha(i)s equal 0 vs. H(a): not all alpha(i)s equal 0, the test statistic is MSA/MSE . To find these values, first the corresponding Degrees of Freedom obscured in the output must be calculated. We know that a = 4, b = 3 and N(Total) = 45. Source DF A (Subject) 3 B (Degree) 2 AB interactions 6 Error 33 Total 44 (You could also find error DF by subtracting other DFs from total DF: 44 - 3 - 2 - 6 = 33.) (NOTE: For this problem you need not calculate any DFs or Mean Squares except for Factor A (Subject) and Error.) The required Mean Squares obscured in the output can be calculated from the Adjusted Sum of Squares (as you may recall from lecture the Sequential SS are not appropriate here as they depend on the order in which the factors are entered into the GLM procedure; see UNBALANC.LOG). Source Adj. MS = Adj. SS / DF A (Subject) 1.4109 4.2326 / 3 B (Degree) 4.1144 8.2287 / 2 AB interaction 0.0074 0.0444 / 6 Error 0.0218 0.7180 / 33 MSA 1.4109 The test statistics is then ----- = --------- = 64.72 MSE 0.0218 The critical F value F(0.01;3,33) is approximately 4.46 (see NWK table B.4 F(.01;3,30) = 4.51 and F(.01;3,40) = 4.31). For purposes here a critical value about 4.5 (or a touch less) is fine here. Since 64.72 > 4.46 we reject H(o) that all alpha(i)s equal 0. Part c. read in data set MTB > read '/usr/class/ed257/adjprof.dat' into c1-c3 45 ROWS READ ROW C1 C2 C3 1 1.7 1 1 2 1.9 1 1 3 1.8 1 2 4 2.1 1 2 . . . In order to replicate the GLM analysis: MTB > glm c1 = c2|c3 Factor Levels Values C2 4 1 2 3 4 C3 3 1 2 3 Analysis of Variance for C1 Source DF Seq SS Adj SS Adj MS F P C2 3 4.1676 4.2326 1.4109 64.85 0.000 C3 2 8.3825 8.2287 4.1144 189.10 0.000 C2*C3 6 0.0444 0.0444 0.0074 0.34 0.910 Error 33 0.7180 0.7180 0.0218 Total 44 13.3124 Unusual Observations for C1 Obs. C1 Fit Stdev.Fit Residual St.Resid 39 2.30000 2.55000 0.10430 -0.25000 -2.40R 40 2.80000 2.55000 0.10430 0.25000 2.40R R denotes an obs. with a large st. resid. In order to obtain unweighted mean (i.e. ) Miller's approximate solution, (see also unblanced.log course example) first carry out a one-way analysis of variance on the cell means. You can obtain the cell means by issuing the following command. MTB > table c2 c3; SUBC> mean c1; SUBC> count. ROWS: C2 COLUMNS: C3 1 2 3 ALL 1 1.8000 1.9500 2.7000 2.4250 2 2 8 12 2 2.4500 2.5200 3.4500 2.7846 4 5 4 13 3 2.7500 2.8500 3.7400 3.2364 2 4 5 11 4 2.5500 2.5500 3.4200 3.0333 2 2 5 9 ALL 2.4000 2.5385 3.2364 2.8489 10 13 22 45 CELL CONTENTS -- C1:MEAN COUNT Now you want to enter the cell means, the row indicators, and the column indicators into columns so that you can carry out a oneway analysis of variance. (An alternative to typing is to use your editor to cut in the means from above and use set command to create indices in C2 C3) MTB > read data into c11-c13 DATA> 1.80 1 1 DATA> 1.95 1 2 DATA> 2.70 1 3 DATA> 2.45 2 1 DATA> 2.52 2 2 DATA> 3.45 2 3 DATA> 2.75 3 1 DATA> 2.85 3 2 DATA> 3.74 3 3 DATA> 2.55 4 1 DATA> 2.55 4 2 DATA> 3.42 4 3 DATA> end 12 ROWS READ Carry out a twoway ANOVA. MTB > twoway c11-c13 ANALYSIS OF VARIANCE C11 SOURCE DF SS MS C12 3 1.50389 0.50130 C13 2 2.17280 1.08640 ERROR 6 0.01413 0.00236 TOTAL 11 3.69083 In order to obtain the unweighted mean (Miller's) approximate solution, calculate the harmonic mean for this set of data. harmonic mean = [(1/12) x (1/2 + 1/2 + 1/8 + 1/4 + 1/5 + 1/4 + 1/2 + 1/4 + 1/5 + 1/2 + 1/2 + 1/5)]^-1 = 3.0189 (approx.=3) ---------------------------------------------------- or if you like computational overkill, here's an example Harmonic Mean from Mathematica < 15/4, HarmonicMean -> 160/53, Median -> 4} N[%] {Mean -> 3.75, HarmonicMean -> 3.018867924528302, Median -> 4.} ------------------------------------------------- Comparing the GLM and Miller solutions we see: GLM Solution SOURCE DF Adj SS Adj MS F C2 3 4.2326 1.4109 64.85 C3 2 8.2287 4.1144 189.10 Interaction 6 .0444 .0074 .34 Error 33 .7180 .0218 MILLER Solution SOURCE DF SS MS F C12 3 3(1.50389)=4.51167 1.50389 68.99 C13 2 3(2.17280)=6.51840 3.25920 149.50 Interaction 6 3(0.01413)=0.04239 .00707 .32 Error 33 .7180 .0218 (Note: The "error" term we obtained from the two-way analysis of variance on the cell means is actually an estimate of the interaction. Our estimate of SSE for Miller's solution is obtained by taking the sum of the squared within cell deviations (i.e. X(ijk) - X(ij bar)). This procedure gives within-cell error estimates identical to GLM.) As we can see, the GLM solution and Miller's approximate solution yield similar results. At any reasonable Type I error rate, we can reject the null hypotheses of no main effects, for both subject matter and degree earned. Likewise, neither solution offers evidence of an interaction between subject matter and degree earned. This is pretty good considering this unbalanced design is a little outside the guidline of nMax/nMin <= 3 for Miller's procedure. ----------------------------------------------------------------------- 5. Power calcs NWK ver 4 26.7, ver 5 16.26 What is the power of the test in Problem 1(c)? Sample sizes were n_1 = 8, n_2 = 10, n_3=6 Test was carried out with Type I error rate alpha set to .01 For the power calculation we are given: mu_1 = 37 mu_2 = 35 mu_3 = 28 sigma = 4.5 Begin by finding the Weighted Mean following NWK sec 26.4 (also example on p.1055) in Ver4, or section 16.10 pp717-8. Weighted Mean, mu(.) = (8*37 + 10*35 + 6*28)/24 = 33.9167 Next, compute the quantity called phi from NWK p. 1055 phi is a measure of how unequal the mu_i are. phi = (Sqrt[(8*(37 - 33.9167)^2 + 10*(35 - 33.9167)^2 + 6*(28 -33.9167)^2)/3]/4.5) = 2.21418 The degrees of freedom for the omnibus test are 2 and 21 Interpolating Table B.11 to look up the related value of power Power = .70 or .71 -------------------------------------------------- 6. Depression class example, blocking MTB > Read file DEPRESS.DAT describe c1-c2 Descriptive Statistics Variable N Mean Median TrMean StDev SEMean C1 30 6.300 6.000 6.154 2.395 0.437 C2 30 7.567 8.000 7.654 2.596 0.474 Variable Min Max Q1 Q3 C1 3.000 12.000 4.000 8.000 C2 2.000 12.000 5.750 9.250 or "stack" data into 60 rows with outcome group and block as the columns outcomes and treatment-placebo indicator and block identifier (1,...30) Data Display Row C3 C4 C5 1 6 1 1 2 4 1 2 3 6 1 3 4 7 1 4 5 5 1 5 6 6 1 6 7 8 1 7 8 7 1 8 9 8 1 9 10 3 1 10 11 9 1 11 12 4 1 12 13 8 1 13 14 11 1 14 15 12 1 15 16 6 1 16 17 10 1 17 18 3 1 18 19 5 1 19 20 4 1 20 21 6 1 21 22 7 1 22 23 5 1 23 24 6 1 24 25 3 1 25 26 10 1 26 27 5 1 27 28 4 1 28 29 4 1 29 30 7 1 30 31 4 2 1 32 7 2 2 33 12 2 3 34 10 2 4 35 2 2 5 36 11 2 6 37 9 2 7 38 5 2 8 39 11 2 9 40 8 2 10 41 7 2 11 42 6 2 12 43 8 2 13 44 9 2 14 45 9 2 15 46 8 2 16 47 10 2 17 48 9 2 18 49 8 2 19 50 5 2 20 51 8 2 21 52 7 2 22 53 6 2 23 54 9 2 24 55 3 2 25 56 5 2 26 57 11 2 27 58 7 2 28 59 3 2 29 60 10 2 30 So we have a 2x30 design with 1 replication per cell Two-way Analysis of Variance Analysis of Variance for C3 Source DF SS MS C4 1 24.07 24.07 C5 29 237.73 8.20 Error 29 123.93 4.27 Total 59 385.73 model statement needs to tell anova not to fit both an interaction AND an error term For example anova c3 = c4|c5 - c4*c5 anova c3 = c4 c5 Analysis of Variance (Balanced Designs) Factor Type Levels Values C4 fixed 2 1 2 C5 fixed 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Analysis of Variance for C3 Source DF SS MS F P C4 1 24.067 24.067 5.63 0.024 C5 29 237.733 8.198 1.92 0.042 Error 29 123.933 4.274 Total 59 385.733 Analysis of Variance for C3 Source DF SS MS F P C4 1 24.067 24.067 5.63 0.024 C5 29 237.733 8.198 1.92 0.042 Error 29 123.933 4.274 Total 59 385.733 So if we use Type I error rate .05 to test treatment vs placebo we find a significant effect b) calculate how much blocking helps by comparing results with a randomized design (no matching) or could use NWK sec 27.7 (ver4) sec 21.9 (ver5). note ignore blocking One-Way Analysis of Variance Analysis of Variance on C3 Source DF SS MS F p C4 1 24.07 24.07 3.86 0.054 Error 58 361.67 6.24 Total 59 385.73 MSE increases from 4.27 to 6.24 when the blocking structure is ignored, so relative efficiency of blocking here was 1.46, about 1.5. c) paired t-test (t-test one-sample on difference scores for each block) let c10 = c2 -c1 ttest c10 T-Test of the Mean Test of mu = 0.000 vs mu not = 0.000 Variable N Mean StDev SE Mean T P-Value C10 30 1.267 2.924 0.534 2.37 0.024 NOTE the anova test statistic 5.63 = (2.37)^2 these are equivalent. tinterval c10 Confidence Intervals Variable N Mean StDev SE Mean 95.0 % C.I. C10 30 1.267 2.924 0.534 ( 0.175, 2.359) ------------------------------------------------------------- 7. Randomized Block NWK ch 27 (ver4), ch 21 (ver5) a) Statistical model Y(ij) = mu + rho(i) + tau(j) + epsilon(ij) where Y(ij) is the proficiency of the auditor in block i and treatment j mu is the grand mean rho(i) is the main effect for block at level i tau(j) is the main effect for training method at level j epsilon(ij) is random error associated with the observation at level i of the blocking factor and level j of the treatment factor i=1,...,10 j=1,2,3 Assumptions: mu is a constant rho(i) are constants that sum to zero tau(j) are constants that sum to zero epsilon(ij) are independent normal (0,sigma-sq) The model generally assumes no block*treatment interaction b) Profile plot Here is the treatment*block table: 1 2 3 ALL 1 73.000 81.000 92.000 82.000 2 76.000 78.000 89.000 81.000 3 75.000 76.000 87.000 79.333 4 74.000 77.000 90.000 80.333 5 76.000 71.000 88.000 78.333 6 73.000 75.000 86.000 78.000 7 68.000 72.000 88.000 76.000 8 64.000 74.000 82.000 73.333 9 65.000 73.000 81.000 73.000 10 62.000 69.000 78.000 69.667 ALL 70.600 74.600 86.100 77.100 Plot--blocks 1-10 labeled A-J - - A 90+ BD - CEG - F - - A HI 80+ - BD J - BCE CF - ADF HI - EG 70+ J - G - I - H - J --------+---------+---------+---------+---------+-------- 1.20 1.60 2.00 2.40 2.80 The plot shows a clear main effect for method; values increase as j=1,2,3 on horizontal axis increases. There may be seen some evidence of disordinal interaction between method and blocks; but it's hard to tell whether it's large enough to be concerned about. A Tukey 1df for non-addditivity may be a useful sidecheck here (but since method is so highly significant, we need not be so worried about an interaction slightly inflating the error term). c) ANOVA table and test for main effect MTB > anova c1=c3 c2 Factor Type Levels Values method fixed 3 1 2 3 block fixed 10 1 2 3 4 5 6 7 8 9 10 Analysis of Variance for profmeas Source DF SS MS F P method 2 1295.00 647.50 103.75 0.000 block 9 433.37 48.15 7.72 0.000 Error 18 112.33 6.24 Total 29 1840.70 Test for main effect for training method Ho: tau(1)=tau(2)=tau(3)=0 vs. Ha: not all tau(j)=0 Test statistic = MSTR/MSW = 647.50/6.24 = 103.75 Critical value = F(.99,2,18) = 6.0129 Reject Ho; there is a significant effect of training method. d) Tukey pairwise comparisons for treatment means The half-width of the Tukey interval is (see NWK sec 27.5 ver 4, sec 21.5 ver5) sqrt(MSW/#blocks)*q(1-alpha, #treatments, dfw) = sqrt(6.24/10)*q(.99,3,18) = sqrt(6.24/10)*4.70 = 3.71 So the interval estimates are For mu(1)-mu(2): 70.6-74.6 +/- 3.71 = (-7.71,-.29) For mu(2)-mu(3): 74.6-86.1 +/- 3.71 = (-15.21,-7.79) For mu(1)-mu(3): 70.6-86.1 +/- 3.71 = (-19.21,-11.79) None of these intervals contains zero, indicating significant differences between all pairs of treatment means. Method 3 seems clearly superior-- scary as it sounds there may be some reason to go to Chicago. e) Usefulness of blocking The easiest way to do this is to run the the one-way ANOVA without the blocking factor and compare the MSW from this with the MSW above. MTB > oneway c1 c3 ANALYSIS OF VARIANCE ON profmeas SOURCE DF SS MS F p method 2 1295.0 647.5 32.04 0.000 ERROR 27 545.7 20.2 TOTAL 29 1840.7 MSW increases from 6.24 to 20.2 when blocks are not used, so the relative efficiency of blocking is 20.2/6.24 = 3.24. So we would require at least 3 times as many subjects to achieve the same precision for investigating differences between the 3 methods with a design that did not use blocking-- 90 subjects or more. --------------------------------------------------------------- END 2005 HW1 SOLUTIONS