HW 6 solutions

1a) Total number of experimental units: 8 x 3 = 24. Total number of treatments: 3 x 4 = 12. There are 2 replications per treatment (r = 2).

    Source      df   calculation
    Model       11   (12-1)
    Error       12   (23-11)
    Total       23   (24-1)

    Source      df   calculation
    Pesticide    3   (4-1)
    Variety      2   (3-1)
    Pest*var     6   (3*2)

Notice: df(model) = df(pest) + df(var) + df(pest*var); 11 = 3 + 2 + 6.

For 1b, 1c, and 1d we need the following mean squares for calculating the F statistics. Note: MS = SS/df.

    Source      df   SS            MS
    Model       11   6680.458333   607.3144
    Error       12    507.5         42.29167
    Total       23   7187.958333

    Source      df   SS            MS
    Pesticide    3   2227.458333    742.4861
    Variety      2   3996.083333   1998.042
    Pest*var     6    456.916667     76.15278

1b) The test for an interaction is F = MS(pest*var)/MS(error). F = 76.15278/42.29167 = 1.800657. The df are 6 and 12. The F(6,12,0.1) critical value is 2.33, so the p-value > 0.1. There is no evidence of a significant interaction.

1c) The test for the presence of pesticide main effects is F = MS(pest)/MS(error). F = 742.4861/42.29167 = 17.55632. The df are 3 and 12. The F(3,12,0.01) critical value is 5.95, so the p-value < 0.01. There is significant evidence that the average of the treatment means for each pesticide is not the same.

1d) The test for the presence of variety main effects is F = MS(var)/MS(error). F = 1998.042/42.29167 = 47.24433. The df are 2 and 12. The F(2,12,0.01) critical value is 6.93, so the p-value < 0.01. There is significant evidence that the average of the treatment means for each variety is not the same.

1e) I like to visualize the coefficients for a contrast in a matrix that has the same form as the table of means (rows are pesticides, columns are varieties). Below are these 2 matrices:

    Means:                  Coefficients:
    44     48     67        0.25   -0.25   0
    52.5   62.5   88.5      0.25   -0.25   0
    40.5   47.5   65.5      0.25   -0.25   0
    50.5   79     92        0.25   -0.25   0

The SSC = (chat^2)/(sum((ki^2)/ri)).

First calculate the linear combination of means:
chat = ((44 + 52.5 + 40.5 + 50.5)*0.25) - ((48 + 62.5 + 47.5 + 79)*0.25) = -12.375.
chat^2 = 153.1406.

Next calculate the sum((ki^2)/ri).
Since all the ri are equal to 2, we can pull r out of the summation. Also, since all of the ki^2 that are not equal to zero are equal to 0.0625, our calculations are greatly simplified. So, the sum((ki^2)/ri) = (8*0.0625)/2 = 0.25. Thus, SSC = 153.1406/0.25 = 612.5625.

1f) Again visualize the coefficients for the contrast in a matrix that has the same form as the table of means. Below are these 2 matrices:

    Means:                  Coefficients:
    44     48     67        1/8   1/8   -0.25
    52.5   62.5   88.5      1/8   1/8   -0.25
    40.5   47.5   65.5      1/8   1/8   -0.25
    50.5   79     92        1/8   1/8   -0.25

The SSC = (chat^2)/(sum((ki^2)/ri)).

First calculate the linear combination of means:
chat = ((44 + 52.5 + 40.5 + 50.5 + 48 + 62.5 + 47.5 + 79)*0.125) - ((67 + 88.5 + 65.5 + 92)*0.25) = -25.1875.
chat^2 = 634.4102.

Next calculate the sum((ki^2)/ri). Again all the ri are equal to 2, so we can pull r out of the summation. Since 8 of the ki^2 = 0.125^2 = 0.015625 and 4 of the ki^2 = 0.0625, the sum((ki^2)/ri) = ((8*0.015625) + (4*0.0625))/2 = 0.375/2 = 0.1875. Thus, SSC = 634.4102/0.1875 = 3383.521.

1g) Yes, they are orthogonal. Two contrasts are orthogonal if the sum((coefficient contrast 1 * coefficient contrast 2)/r) = 0. Since our data are balanced, we can just look at the coefficients.

    Coefficients 1e:      Coefficients 1f:      Product:
    0.25   -0.25   0      1/8   1/8   -0.25     0.03125   -0.03125   0
    0.25   -0.25   0      1/8   1/8   -0.25     0.03125   -0.03125   0
    0.25   -0.25   0      1/8   1/8   -0.25     0.03125   -0.03125   0
    0.25   -0.25   0      1/8   1/8   -0.25     0.03125   -0.03125   0

Each row of products sums to 0, so the products sum to 0 overall.

1h) SSC(1e) + SSC(1f) = 612.5625 + 3383.521 = 3996.083, which equals the SS for the variety factor. The reason for this equivalence is that these two contrasts together reproduce the same test as in 1d. Both approaches test whether there are any differences among the treatment means for each variety. Notice that variety has 2 df. Therefore, its SS can be partitioned into 2 orthogonal contrasts. That is what we did in 1e and 1f. First we tested if the means of variety 1 and 2 are different.
Then we tested if the mean of variety 1 and 2 combined is different from the mean of variety 3. These two tests combined are the same as testing if there are any differences among the treatment means for each variety, as we did in 1d.

1i) A confidence interval is an estimate +/- t*SE. The coefficient matrix for calculating the estimate and the SE is below.

    Means:                  Coefficients:
    44     48     67         1/6    1/6    1/6
    52.5   62.5   88.5      -1/6   -1/6   -1/6
    40.5   47.5   65.5       1/6    1/6    1/6
    50.5   79     92        -1/6   -1/6   -1/6

The estimate = ((44 + 48 + 67 + 40.5 + 47.5 + 65.5)*(1/6)) - ((52.5 + 62.5 + 88.5 + 50.5 + 79 + 92)*(1/6)) = -18.75.
t(12,0.975) = 2.179.
SE = sqrt(MSE*sum((ki^2)/ri)) = sqrt((42.29167*12*0.027778)/2) = sqrt(7.048611) = 2.654922.
Note: (1/6)^2 = (-1/6)^2 = 0.027778.

95% CI:
    -18.75 +/- 2.179*2.654922
    -18.75 +/- 5.785075
    (-24.5351, -12.9649)

The insecticide of company B is more effective at increasing yields across all varieties than the insecticide of company A. The estimated average increase in yield is 18.75 bushels per acre. The 95% confidence interval of the estimated increase is 13 to 24.5 bushels per acre.

2) First run a two-factor analysis, check the assumptions, and then look at the results of the analysis.

    /* Two-factor model; save residuals and predicted values in data set two */
    proc glm;
      class temp sucrose;
      model y = temp sucrose temp*sucrose;
      output out=two residual=ehat predicted=yhat;
    run;

    /* Residuals vs. predicted values: check for constant variance */
    proc plot data=two;
      plot ehat*yhat;
    run;

    /* Distribution of the residuals: check for normality */
    proc univariate data=two plot;
      var ehat;
    run;

The assumptions of independence, constant variance, and normality all appear to be met in this experiment. Next check if there is a significant interaction effect.

    The GLM Procedure

    Dependent Variable: y

                                     Sum of
    Source             DF           Squares     Mean Square    F Value    Pr > F
    Model               8       630.2474074      78.7809259      87.07    <.0001
    Error              18        16.2866667       0.9048148
    Corrected Total    26       646.5340741

    R-Square     Coeff Var      Root MSE        y Mean
    0.974809      8.795505      0.951218      10.81481

    Source             DF         Type I SS     Mean Square    F Value    Pr > F
    temp                2       293.1585185     146.5792593     162.00    <.0001
    sucrose             2       309.9585185     154.9792593     171.28    <.0001
    temp*sucrose        4        27.1303704       6.7825926       7.50    0.0010

    Source             DF       Type III SS     Mean Square    F Value    Pr > F
    temp                2       293.1585185     146.5792593     162.00    <.0001
    sucrose             2       309.9585185     154.9792593     171.28    <.0001
    temp*sucrose        4        27.1303704       6.7825926       7.50    0.0010

From the Type III SS we note that there is a significant interaction effect (p = 0.001). Therefore, we need to conduct an analysis on each level of both factors. The slice option of the lsmeans statement in SAS allows us to conduct such an analysis.

    proc glm;
      class temp sucrose;
      model y = temp sucrose temp*sucrose;
      lsmeans temp*sucrose / slice=temp;
      lsmeans temp*sucrose / slice=sucrose;
    run;

    The GLM Procedure
    Least Squares Means

    temp    sucrose        y LSMEAN
    20      20            3.8333333
    20      40            6.5000000
    20      60            8.8000000
    30      20            6.8000000
    30      40           12.6000000
    30      60           16.0000000
    40      20            8.5000000
    40      40           15.3000000
    40      60           19.0000000

    Least Squares Means
    temp*sucrose Effect Sliced by temp for y

                         Sum of
    temp       DF       Squares     Mean Square    F Value    Pr > F
    20          2     37.068889       18.534444      20.48    <.0001
    30          2    129.840000       64.920000      71.75    <.0001
    40          2    170.180000       85.090000      94.04    <.0001

    Least Squares Means
    temp*sucrose Effect Sliced by sucrose for y

                         Sum of
    sucrose    DF       Squares     Mean Square    F Value    Pr > F
    20          2     33.468889       16.734444      18.49    <.0001
    40          2    121.940000       60.970000      67.38    <.0001
    60          2    164.880000       82.440000      91.11    <.0001

These results show that at every level of each factor, there is a significant difference between the means of the other factor (p < 0.0001). To see the direction of the influence, graph the least squares means. These graphs show that at a given temperature, energy expenditure increases as sucrose content increases.
Similarly, at a given sucrose level, energy expenditure increases as ambient temperature increases. (See graphs posted on the course Web site next to these homework solutions.)
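
As a cross-check on the hand calculations, the contrast results in 1e, 1f, and 1i and the slice tests in Problem 2 can be re-derived from the cell means alone. The following is a minimal sketch in Python, not part of the original SAS analysis. It assumes the cell means and mean squares exactly as printed in these solutions, r = 2 replications per cell in Problem 1, and 3 observations per cell in Problem 2 (27 observations over 9 cells, assuming a balanced design). The function names contrast and slice_test are hypothetical helpers introduced only for this illustration.

```python
from math import sqrt

# Problem 1 cell means (rows = pesticides 1-4, columns = varieties 1-3).
P1_MEANS = [
    [44.0, 48.0, 67.0],
    [52.5, 62.5, 88.5],
    [40.5, 47.5, 65.5],
    [50.5, 79.0, 92.0],
]
P1_R = 2            # replications per cell (from 1a)
P1_MSE = 42.29167   # MS(error) from the Problem 1 ANOVA table
T_CRIT = 2.179      # t(12, 0.975), table value quoted in 1i


def contrast(means, coeffs, r):
    """Return (chat, sum(k^2/r), SSC) for a contrast on balanced cell means."""
    chat = sum(k * m
               for k_row, m_row in zip(coeffs, means)
               for k, m in zip(k_row, m_row))
    denom = sum(k * k for k_row in coeffs for k in k_row) / r
    return chat, denom, chat ** 2 / denom


coef_1e = [[0.25, -0.25, 0.0]] * 4
coef_1f = [[0.125, 0.125, -0.25]] * 4
coef_1i = [[1 / 6] * 3, [-1 / 6] * 3, [1 / 6] * 3, [-1 / 6] * 3]

est_1e, _, ssc_1e = contrast(P1_MEANS, coef_1e, P1_R)
est_1f, _, ssc_1f = contrast(P1_MEANS, coef_1f, P1_R)
est_1i, denom_1i, _ = contrast(P1_MEANS, coef_1i, P1_R)

# 1i: standard error and 95% confidence interval for the estimate.
se_1i = sqrt(P1_MSE * denom_1i)
ci_1i = (est_1i - T_CRIT * se_1i, est_1i + T_CRIT * se_1i)

# Problem 2 least squares means (rows = temp 20/30/40, columns = sucrose 20/40/60).
P2_LSMEANS = [
    [3.8333333, 6.5, 8.8],
    [6.8, 12.6, 16.0],
    [8.5, 15.3, 19.0],
]
P2_N = 3             # observations per cell (assumed: 27 obs / 9 cells)
P2_MSE = 0.9048148   # MS(error) from the Problem 2 ANOVA table


def slice_test(cell_means, n, mse):
    """Return (SS, F) for differences among cell means at one fixed level."""
    avg = sum(cell_means) / len(cell_means)
    ss = n * sum((m - avg) ** 2 for m in cell_means)
    ms = ss / (len(cell_means) - 1)
    return ss, ms / mse


by_temp = [slice_test(row, P2_N, P2_MSE) for row in P2_LSMEANS]
by_sucrose = [slice_test(list(col), P2_N, P2_MSE) for col in zip(*P2_LSMEANS)]

print("SSC(1e), SSC(1f), sum:", ssc_1e, ssc_1f, ssc_1e + ssc_1f)
print("1i estimate, SE, CI:", est_1i, se_1i, ci_1i)
print("slices by temp (SS, F):", by_temp)
print("slices by sucrose (SS, F):", by_sucrose)
```

The sum SSC(1e) + SSC(1f) printed by the script should match the variety SS of 3996.083, and the slice sums of squares and F values should match the SAS slice tables up to rounding. The critical values are not computed here; they are the table values quoted in the solutions.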