HW 7 solutions

1) An approach similar to the one in homework 6 will help us calculate the SS. Since there are three variety means for pesticide 1, there are 2 d.f., and 2 orthogonal contrasts are needed to partition these 2 d.f. for the comparison of the 3 variety means. Below are the coefficient matrices for two possible orthogonal contrasts. Each row is a pesticide (1 to 4) and each column is a variety.

                              Contrast 1        Contrast 2
   Means:                     Coefficients:     Coefficients:
   44      48      67          1   -1    0      0.5   0.5   -1
   52.5    62.5    88.5        0    0    0      0     0      0
   40.5    47.5    65.5        0    0    0      0     0      0
   50.5    79      92          0    0    0      0     0      0

The SSC = (chat^2)/(sum((ki^2)/ri)), where chat = sum(ki*mean_i) is the estimated contrast and ri is the number of observations in mean i (ri = 2 here).

   SSC1: (44 - 48)^2 / [(1^2 + (-1)^2)/2] = 16/1 = 16.
   SSC2: (44*0.5 + 48*0.5 - 67)^2 / [(0.5^2 + 0.5^2 + (-1)^2)/2] = 441/0.75 = 588.

Therefore, SS(pesticide 1) = SSC1 + SSC2 = 16 + 588 = 604.

Only the line for pesticide 1 was required, but if you would like to try some of the other lines, repeat the process for pesticides 2, 3, and 4 to obtain the results below.

   Pesticide   DF   SS        MS       F       Pr > F
   1            2   604       302       7.14   0.0091
   2            2   1381.33   690.67   16.33   0.0004
   3            2   665.33    332.67    7.87   0.0066
   4            2   1802.33   901.17   21.31   0.0001
   Total        8   4453      <-- This total is equal to SS(variety) + SS(variety*pesticide).

Note: You are not expected to get exact p-values. P-value < 0.01 is an adequate answer. In each case the denominator of the F statistic is the MSE from the fit of the full model with 12 treatment means (4 pesticides x 3 varieties) and d.f. = 24 - 12 = 12. Therefore the p-values can be obtained by comparing to an F distribution with 2 and 12 d.f.

2a) The SAS code to produce the table analogous to Table 6.11 is below. Note that this code is also used to check the model assumptions, which this data set meets.

   /* full polynomial model: linear and quadratic terms and all their interactions */
   proc glm;
     model y = temp temp*temp sucrose sucrose*sucrose
               temp*sucrose temp*sucrose*sucrose
               temp*temp*sucrose temp*temp*sucrose*sucrose;
     output out=two residual=ehat predicted=yhat;
   run;

   /* residuals vs. predicted values */
   proc plot;
     plot ehat*yhat;
   run;

   /* distribution of the residuals */
   proc univariate plot;
     var ehat;
   run;

   Source               DF   SS       MS       F        Pr > F
   Total                26   646.53
   Temp (T)              2   293.16   146.58   162      <0.0001
     T lin               1   280.06   280.06   309.52   <0.0001
     T quad              1    13.1     13.1     14.48    0.0013
   Sucrose (S)           2   309.96   154.98   171.28   <0.0001
     S lin               1   304.22   304.22   336.23   <0.0001
     S quad              1     5.74     5.74     6.34    0.0215
   T x S (TS)            4    27.13     6.78     7.5     0.001
     T lin x S lin       1    22.96    22.96    25.38   <0.0001
     T lin x S quad      1     1.87     1.87     2.06    0.1679
     T quad x S lin      1     2.15     2.15     2.38    0.1405
     T quad x S quad     1     0.15     0.15     0.16    0.6905
   Error                18    16.29     0.9

2b) Keeping only the terms that are significant in the table above:

   y = B0 + B1*T + B2*T^2 + B3*S + B4*S^2 + B5*(T*S).

2c) The SAS code below generates the information needed to conduct the full vs. reduced model (lack of fit) test.

   /* reduced model */
   proc glm;
     model y = temp temp*temp sucrose sucrose*sucrose temp*sucrose;
   run;

                              Sum of
   Source             DF      Squares        Mean Square    F Value   Pr > F
   Model               5      626.0803704    125.2160741     128.56   <.0001
   Error              21       20.4537037      0.9739859
   Corrected Total    26      646.5340741

   /* full model */
   proc glm;
     class temp sucrose;
     model y = temp sucrose temp*sucrose;
   run;

                              Sum of
   Source             DF      Squares        Mean Square    F Value   Pr > F
   Model               8      630.2474074     78.7809259      87.07   <.0001
   Error              18       16.2866667      0.9048148
   Corrected Total    26      646.5340741

   F = [(20.45 - 16.29)/(21 - 18)] / (16.29/18) = 1.54
   d.f. = 3 and 18
   p-value = 0.2397

Conclusion: There is no evidence of a lack of fit. Therefore we accept the model in 2b.

The following SAS code will give you all the output needed to answer questions 2d-f.

   proc glm;
     model y = temp temp*temp sucrose sucrose*sucrose temp*sucrose / solution clparm;
     /* predicted mean at temp = 30, sucrose = 50 (30^2 = 900, 50^2 = 2500, 30*50 = 1500) */
     estimate 'energy for 50% sucrose, 30 degrees'
              intercept 1 temp 30 sucrose 50
              temp*temp 900 sucrose*sucrose 2500 temp*sucrose 1500;
   run;

2d) The parameter estimates are needed to calculate the equation. Below is the output from SAS.

   Parameter              Estimate
   Intercept           -16.51481481
   temp                  1.00444444
   temp*temp            -0.01477778
   sucrose               0.19361111
   sucrose*sucrose      -0.00244444
   temp*sucrose          0.00691667

   y = B0 + B1*T + B2*T^2 + B3*S + B4*S^2 + B5*(T*S)
   y = -16.515 + 1.0044*30 + (-0.014778)*30^2 + 0.1936*S + (-0.002444)*S^2 + 0.006917*30*S
   y = 0.32 + 0.4*S + (-0.002)*S^2   (after rounding the coefficients)

2e) Using the rounded equation in 2(d): y = 0.32 + 0.4*50 + (-0.002)*50^2 = 15.32. If you used SAS, or did your calculations without rounding, y = 14.26. The gap comes almost entirely from rounding the S^2 coefficient from -0.00244 to -0.002, because that coefficient is multiplied by 50^2 = 2500.

   Parameter                                Estimate
   energy for 50% sucrose, 30 degrees     14.2629630

2f) The 95% confidence interval is (13.46, 15.07).

   Parameter                              95% Confidence Limits
   energy for 50% sucrose, 30 degrees     13.4577181   15.0682079

3) The answers to most parts of this problem can be read from the output produced by the SAS program ch6pr6sol.sas.

3a) Below are all the means:

                                           Standard
   surface   filler    p      y LSMEAN        Error    Pr > |t|
   1         1        25    201.000000    11.592023      <.0001
   1         1        50    237.000000    11.592023      <.0001
   1         1        75    267.000000    11.592023      <.0001
   1         2        25    213.000000    11.592023      <.0001
   1         2        50    233.500000    11.592023      <.0001
   1         2        75    234.500000    11.592023      <.0001
   2         1        25    164.000000    11.592023      <.0001
   2         1        50    187.500000    11.592023      <.0001
   2         1        75    232.000000    11.592023      <.0001
   2         2        25    148.500000    11.592023      <.0001
   2         2        50    113.500000    11.592023      <.0001
   2         2        75    143.500000    11.592023      <.0001

You are asked to report S1, F2, 50%: mean = 233.5, SE = 11.59.

3b) Below is the relevant SAS output. It gives the estimated mean for each filler proportion, averaged over filler types and surface treatments.

                        Standard
   p      y LSMEAN        Error    Pr > |t|
   25    181.625000     5.796012     <.0001
   50    192.875000     5.796012     <.0001
   75    219.250000     5.796012     <.0001

3c) Yes. By examining the p-values in the Type I output, we can see that the terms indicated by "-->" can be dropped from the model.

         Source         DF     Type I SS      Mean Square    F Value   Pr > F
         surface (S)     1    26268.16667    26268.16667      97.74    <.0001
         filler (F)      1     6800.66667     6800.66667      25.30    0.0003
         p               1     5662.56250     5662.56250      21.07    0.0006
   -->   p*p             1      305.02083      305.02083       1.13    0.3077
         S*F             1     3952.66667     3952.66667      14.71    0.0024
   -->   p*S             1      150.06250      150.06250       0.56    0.4693
   -->   p*p*S           1     1036.02083     1036.02083       3.85    0.0732
         p*F             1     3451.56250     3451.56250      12.84    0.0038
   -->   p*p*F           1       77.52083       77.52083       0.29    0.6010
   -->   p*S*F           1      203.06250      203.06250       0.76    0.4018
   -->   p*p*S*F         1      275.52083      275.52083       1.03    0.3313

The appropriate regression model is

   y = B0 + B1*S + B2*F + B3*p + B4*(S*F) + B5*(p*F)

where S=1 if surface=1 and 0 otherwise and F=1 if filler=1 and 0 otherwise. The estimates for the parameters are below:

                                            Standard
   Parameter            Estimate               Error    t Value   Pr > |t|
   Intercept         126.9166667 B       13.97379602       9.08     <.0001
   surface            91.8333333 B        9.88096593       9.29     <.0001
   filler              0.5833333 B       19.76193185       0.03     0.9768
   p                   0.1650000 B        0.24203325       0.68     0.5041
   surface*filler    -51.3333333 B       13.97379602      -3.67     0.0017
   p*filler            1.1750000 B        0.34228670       3.43     0.0030

3d) You should recommend the treatment that leads to the lowest mean because fabric loss is a bad thing in this situation. If you examine the output from the estimate statements following the fit of the simplified model you will see that we have the following regression lines relating proportion of filler to mean fabric weight loss.

   S1 F1   yhat = 168.00 + 1.340p
   S1 F2   yhat = 218.75 + 0.165p
   S2 F1   yhat = 127.50 + 1.340p
   S2 F2   yhat = 126.92 + 0.165p

S2 F2 appears to be the best combination for preventing fabric loss. The estimated regression line for S2 F2 lies below all the others. There were no significant differences among the proportions of filler for the S2 F2 combination because the slope 0.165 is not significantly different from zero (p-value 0.5041). Thus S2, F2, and any proportion (25, 50, 75%) would be a reasonable answer.
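
The four regression lines in 3d follow directly from the parameter estimates of the simplified model. The short Python sketch below is only an arithmetic check and is not part of ch6pr6sol.sas; the names B0 to B5, line, and lines are illustrative choices. It applies the indicator coding stated in 3c (S=1 for surface 1, F=1 for filler 1).

```python
# Parameter estimates copied from the simplified-model output in 3c.
B0 = 126.9166667   # intercept
B1 = 91.8333333    # surface (S = 1 for surface 1, 0 for surface 2)
B2 = 0.5833333     # filler  (F = 1 for filler 1, 0 for filler 2)
B3 = 0.1650000     # p (proportion of filler)
B4 = -51.3333333   # surface*filler
B5 = 1.1750000     # p*filler


def line(surface, filler):
    """Return (intercept, slope) of yhat versus p for one surface/filler pair."""
    S = 1 if surface == 1 else 0
    F = 1 if filler == 1 else 0
    intercept = B0 + B1 * S + B2 * F + B4 * S * F
    slope = B3 + B5 * F
    return intercept, slope


# One fitted line per surface/filler combination.
lines = {(s, f): line(s, f) for s in (1, 2) for f in (1, 2)}
for (s, f), (a, b) in lines.items():
    print(f"S{s} F{f}: yhat = {a:.2f} + {b:.3f}p")
```

The intercept collects every term that does not involve p, and the slope is B3 plus B5 when filler 1 is used. This reproduces point estimates only; standard errors and tests still require the estimate statements in SAS.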

You can write estimate statements to test various differences and verify that any choice other than S2 F2 will be inferior. An example estimate statement is provided that shows how to compare S2, F1, p=25 to S2, F2, p=25. There is a multiple testing issue here that you may worry about, but I would not hesitate to recommend S2 F2 to the company if it were necessary to recommend one combination of surface material and filler.

Try to understand all the code contained in ch6pr6sol.sas and how it can be used to answer the questions. It would be a good idea to sketch the four lines to help you understand the answer.
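
As a starting point for the sketch, the four fitted lines from 3d can be tabulated at the three filler proportions used in the experiment. The Python snippet below is a minimal illustration and not part of ch6pr6sol.sas; the names LINES, PROPORTIONS, predict, and table are hypothetical. It uses the rounded intercepts and slopes reported in 3d, so values may differ slightly from SAS output, and it gives point estimates only, with no standard errors or tests.

```python
# Fitted lines (intercept, slope) as reported in 3d, rounded as printed there.
LINES = {
    "S1 F1": (168.00, 1.340),
    "S1 F2": (218.75, 0.165),
    "S2 F1": (127.50, 1.340),
    "S2 F2": (126.92, 0.165),
}
PROPORTIONS = (25, 50, 75)


def predict(combo, p):
    """Estimated mean weight loss for a surface/filler combo at proportion p."""
    intercept, slope = LINES[combo]
    return intercept + slope * p


# Estimated means at each proportion, one row per combination.
table = {combo: [predict(combo, p) for p in PROPORTIONS] for combo in LINES}
print("combo  " + "  ".join(f"p={p:<5d}" for p in PROPORTIONS))
for combo, values in table.items():
    print(combo + "  " + "  ".join(f"{v:7.2f}" for v in values))

# Point estimate of the S2 F1 vs. S2 F2 comparison at p = 25.
diff_25 = predict("S2 F1", 25) - predict("S2 F2", 25)
print(f"S2 F1 - S2 F2 at p=25: {diff_25:.2f}")
```

Plotting the three tabulated points for each combination gives the four lines. Whether a given difference is statistically significant still has to come from the estimate statements in SAS.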