HW 7 solutions
1) An approach similar to the one in homework 6 gives the SS.
Pesticide 1 has three variety means, so there are 2 d.f., and
2 orthogonal contrasts are needed to partition these 2 d.f. for
the comparison of the 3 variety means.
Below are the coefficient matrices for one possible pair of orthogonal contrasts.
(rows = pesticides 1-4, columns = varieties 1-3)
                         Contrast 1       Contrast 2
Means:                   Coefficients:    Coefficients:
44     48     67          1  -1   0       0.5  0.5  -1
52.5   62.5   88.5        0   0   0       0    0     0
40.5   47.5   65.5        0   0   0       0    0     0
50.5   79     92          0   0   0       0    0     0
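The SSC = (chat^2)/(sum((ki^2)/ri)), where chat = sum(ki*mean_i) is the
estimated contrast and each mean is based on ri = 2 observations.
SSC1: (44 - 48)^2 / [(1^2 + (-1)^2)/2] = 16/1 = 16.
SSC2: (44*0.5 + 48*0.5 - 67)^2 / [(0.5^2 + 0.5^2 + (-1)^2)/2] = 441/0.75 = 588.
Therefore, the SS(pesticide 1) = SSC1 + SSC2 = 16 + 588 = 604.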
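The two contrast sums of squares can be checked numerically. The sketch
below uses Python rather than SAS; the helper name contrast_ss is
hypothetical, and it assumes each cell mean is based on r = 2
observations, as the divisions by 2 above imply.

```python
def contrast_ss(means, coefs, r):
    """SS for one contrast: (sum k_i * mean_i)^2 / sum(k_i^2 / r)."""
    c_hat = sum(k * m for k, m in zip(coefs, means))
    return c_hat ** 2 / sum(k ** 2 / r for k in coefs)

pesticide1 = [44, 48, 67]  # variety means for pesticide 1

ssc1 = contrast_ss(pesticide1, [1, -1, 0], r=2)       # variety 1 vs. variety 2
ssc2 = contrast_ss(pesticide1, [0.5, 0.5, -1], r=2)   # varieties 1, 2 vs. variety 3
ss_pesticide1 = ssc1 + ssc2

print(ssc1, ssc2, ss_pesticide1)
```

Because the two contrasts are orthogonal, their sum is the full 2 d.f. SS
among the three variety means for pesticide 1.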
Only the line for pesticide 1 was required, but if you would like
to try some of the other lines, repeat the process for pesticides
2, 3, and 4 to obtain the results below.
Pesticide   DF   SS        MS       F       Pr > F
1           2    604       302      7.14    0.0091
2           2    1381.33   690.67   16.33   0.0004
3           2    665.33    332.67   7.87    0.0066
4           2    1802.33   901.17   21.31   0.0001
Total       8    4453      <-- This total equals SS(variety) + SS(variety*pesticide).
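The SS column can also be obtained without writing contrasts, as r times
the sum of squared deviations of the three variety means from their
average within each pesticide. The Python sketch below is a check only;
the name among_means_ss is hypothetical and r = 2 observations per mean
is assumed.

```python
cell_means = {  # pesticide -> variety means
    1: [44, 48, 67],
    2: [52.5, 62.5, 88.5],
    3: [40.5, 47.5, 65.5],
    4: [50.5, 79, 92],
}
R = 2  # observations per cell mean (assumed)

def among_means_ss(means, r):
    """r * sum of squared deviations of the means from their average."""
    avg = sum(means) / len(means)
    return r * sum((m - avg) ** 2 for m in means)

ss = {p: among_means_ss(m, R) for p, m in cell_means.items()}
for p, value in ss.items():
    # SS and MS (each row has 2 d.f.)
    print(p, round(value, 2), round(value / 2, 2))

total = sum(ss.values())
print("Total", round(total, 2))
```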
Note: You are not expected to get exact p-values. P-value < 0.01
is an adequate answer. In each case the denominator of the F
statistic is the MSE from the fit of the full model with 12
treatment means (4 pesticides x 3 varieties) and d.f.=24-12=12.
Therefore the p-values can be obtained by comparing to an F
distribution with 2 and 12 d.f.
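If you want to see where the tabled p-values come from, the upper-tail
probability of an F distribution has a simple closed form when the
numerator d.f. is exactly 2: P(F > f) = (1 + 2f/d2)^(-d2/2). The Python
sketch below applies it to the four F statistics; the function name
f_pvalue_2df is hypothetical, and the formula does not apply to other
numerator d.f.

```python
def f_pvalue_2df(f_stat, df2):
    """Upper-tail p-value for an F statistic with 2 numerator d.f."""
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)

f_stats = {1: 7.14, 2: 16.33, 3: 7.87, 4: 21.31}  # pesticide -> F
p_values = {p: f_pvalue_2df(f, 12) for p, f in f_stats.items()}

for p, value in p_values.items():
    print(p, round(value, 4))
```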
2a) The SAS code to produce the table analogous to Table 6.11
is below. The same code also saves the residuals and predicted values
used to check the model assumptions, which this data set meets.
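/* Fit the full second-order polynomial in temp and sucrose; the
   sequential (Type I) SS give the single-d.f. partition below. */
proc glm;
  model y = temp temp*temp sucrose sucrose*sucrose
            temp*sucrose temp*sucrose*sucrose temp*temp*sucrose
            temp*temp*sucrose*sucrose;
  output out=two residual=ehat predicted=yhat;
run;
/* Residuals vs. predicted values: check for constant variance. */
proc plot data=two;
  plot ehat*yhat;
run;
/* Plots and summary statistics of the residuals: check normality. */
proc univariate data=two plot;
  var ehat;
run;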
Source               DF   SS       MS       F        Pr > F
Total                26   646.53
Temp(T)               2   293.16   146.58   162      <0.0001
  T lin               1   280.06   280.06   309.52   <0.0001
  T quad              1   13.1     13.1     14.48    0.0013
Sucrose(S)            2   309.96   154.98   171.28   <0.0001
  S lin               1   304.22   304.22   336.23   <0.0001
  S quad              1   5.74     5.74     6.34     0.0215
T x S (TS)            4   27.13    6.78     7.5      0.001
  T lin x S lin       1   22.96    22.96    25.38    <0.0001
  T lin x S quad      1   1.87     1.87     2.06     0.1679
  T quad x S lin      1   2.15     2.15     2.38     0.1405
  T quad x S quad     1   0.15     0.15     0.16     0.6905
Error                18   16.29    0.9
2b) y = B0 + B1*T + B2*T^2 + B3*S + B4*S^2 + B5*(T*S).
2c) The SAS code below generates the information needed to conduct the
full vs. reduced model (lack-of-fit) test.
/* reduced model */
proc glm;
model y = temp temp*temp sucrose sucrose*sucrose temp*sucrose;
run;
                           Sum of
Source             DF      Squares        Mean Square    F Value   Pr > F
Model               5      626.0803704    125.2160741    128.56    <.0001
Error              21       20.4537037      0.9739859
Corrected Total    26      646.5340741
/* full model */
proc glm;
class temp sucrose;
model y = temp sucrose temp*sucrose;
run;
                           Sum of
Source             DF      Squares        Mean Square    F Value   Pr > F
Model               8      630.2474074     78.7809259     87.07    <.0001
Error              18       16.2866667      0.9048148
Corrected Total    26      646.5340741
F = [(20.4537 - 16.2867)/(21 - 18)] / (16.2867/18) = 1.54
d.f. = 3 and 18
p-value = 0.2397
Conclusion: There is no evidence of a lack of fit. Therefore we accept
the model in 2b.
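The lack-of-fit F statistic can be reproduced from the two error lines
of the SAS output. The Python sketch below is only an arithmetic check;
the variable names are my own.

```python
# Error SS and d.f. from the reduced (second-order) and full (cell means) fits
sse_reduced, df_reduced = 20.4537037, 21
sse_full, df_full = 16.2866667, 18

df_lof = df_reduced - df_full                # lack-of-fit d.f.
ms_lof = (sse_reduced - sse_full) / df_lof   # lack-of-fit mean square
mse_full = sse_full / df_full                # pure-error mean square
f_lof = ms_lof / mse_full

print(df_lof, df_full, round(f_lof, 2))
```

The statistic is compared to an F distribution with df_lof and df_full
d.f.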
The following SAS code will give you all the output needed to answer
questions 2d-f.
proc glm;
model y = temp temp*temp sucrose sucrose*sucrose temp*sucrose
/ solution clparm;
estimate 'energy for 50% sucrose, 30 degrees'
intercept 1 temp 30 sucrose 50 temp*temp 900
sucrose*sucrose 2500 temp*sucrose 1500;
run;
2d) The parameter estimates are needed to calculate the equation.
Below is the output from SAS.
Parameter Estimate
Intercept -16.51481481
temp 1.00444444
temp*temp -0.01477778
sucrose 0.19361111
sucrose*sucrose -0.00244444
temp*sucrose 0.00691667
y = B0 + B1*T + B2*T^2 + B3*S + B4*S^2 + B5*(T*S)
Setting T = 30:
y = -16.51 + 1.0044*30 + (-0.01478)*30^2 + 0.1936*S + (-0.0024)*S^2 + 0.0069*30*S
y = 0.32 + 0.4*S + (-0.002)*S^2
2e) Using the equation in 2(d):
y = 0.32 + 0.4*50 + (-0.002)*50^2 = 15.32.
If you used SAS, or did your calculations without rounding, y = 14.26.
Most of the gap comes from rounding the S^2 coefficient from -0.00244 to -0.002.
Parameter Estimate
energy for 50% sucrose, 30 degrees 14.2629630
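The effect of rounding can be seen by evaluating the fitted equation
both ways. The Python sketch below uses the parameter estimates printed
in 2d; the dictionary keys and the function name predict are
hypothetical labels.

```python
b = {  # parameter estimates from the SAS output in 2d
    "intercept": -16.51481481,
    "T": 1.00444444,
    "T2": -0.01477778,
    "S": 0.19361111,
    "S2": -0.00244444,
    "TS": 0.00691667,
}

def predict(temp, sucrose):
    """Fitted response from the second-order model in 2b."""
    return (b["intercept"] + b["T"] * temp + b["T2"] * temp ** 2
            + b["S"] * sucrose + b["S2"] * sucrose ** 2
            + b["TS"] * temp * sucrose)

y_unrounded = predict(30, 50)
y_rounded = 0.32 + 0.4 * 50 + (-0.002) * 50 ** 2  # equation as rounded in 2d

print(round(y_unrounded, 2), round(y_rounded, 2))
```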
2f) The 95% confidence interval is (13.46,15.07)
Parameter 95% Confidence Limits
energy for 50% sucrose, 30 degrees 13.4577181 15.0682079
3) The answers to most parts of this problem can be read from the
output produced by the SAS program ch6pr6sol.sas.
3a) Below are all the means:
                                      Standard
surface   filler   p    y LSMEAN      Error        Pr > |t|
1         1        25   201.000000    11.592023    <.0001
1         1        50   237.000000    11.592023    <.0001
1         1        75   267.000000    11.592023    <.0001
1         2        25   213.000000    11.592023    <.0001
1         2        50   233.500000    11.592023    <.0001
1         2        75   234.500000    11.592023    <.0001
2         1        25   164.000000    11.592023    <.0001
2         1        50   187.500000    11.592023    <.0001
2         1        75   232.000000    11.592023    <.0001
2         2        25   148.500000    11.592023    <.0001
2         2        50   113.500000    11.592023    <.0001
2         2        75   143.500000    11.592023    <.0001
You are asked to report S1, F2, 50%: mean = 233.5 SE = 11.59
3b) Below is the relevant SAS output. It gives the estimated mean for
each filler proportion, averaged over filler types and surface treatments.
                    Standard
p    y LSMEAN       Error       Pr > |t|
25   181.625000     5.796012    <.0001
50   192.875000     5.796012    <.0001
75   219.250000     5.796012    <.0001
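These marginal means are simple averages of the four cell means at each
proportion. The Python sketch below reproduces them from the table in
3a. It assumes a balanced design in which the four cell means are
independent with a common standard error, so the standard error of
their average is the cell standard error divided by sqrt(4).

```python
import math

cell_means = {  # (surface, filler, p) -> LS mean from 3a
    (1, 1, 25): 201.0, (1, 1, 50): 237.0, (1, 1, 75): 267.0,
    (1, 2, 25): 213.0, (1, 2, 50): 233.5, (1, 2, 75): 234.5,
    (2, 1, 25): 164.0, (2, 1, 50): 187.5, (2, 1, 75): 232.0,
    (2, 2, 25): 148.5, (2, 2, 50): 113.5, (2, 2, 75): 143.5,
}
cell_se = 11.592023  # standard error of each cell mean

marginal = {}
for p in (25, 50, 75):
    cells = [m for (s, f, prop), m in cell_means.items() if prop == p]
    mean_p = sum(cells) / len(cells)
    se_p = cell_se / math.sqrt(len(cells))  # average of 4 independent means
    marginal[p] = (mean_p, se_p)
    print(p, mean_p, round(se_p, 6))
```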
3c) Yes. By examining the p-values in the Type I output, we can see
that the terms indicated by "-->" can be dropped from the model.
    Source        DF   Type I SS      Mean Square    F Value   Pr > F
    surface (S)    1   26268.16667    26268.16667    97.74     <.0001
    filler (F)     1    6800.66667     6800.66667    25.30     0.0003
    p              1    5662.56250     5662.56250    21.07     0.0006
--> p*p            1     305.02083      305.02083     1.13     0.3077
    S*F            1    3952.66667     3952.66667    14.71     0.0024
--> p*S            1     150.06250      150.06250     0.56     0.4693
--> p*p*S          1    1036.02083     1036.02083     3.85     0.0732
    p*F            1    3451.56250     3451.56250    12.84     0.0038
--> p*p*F          1      77.52083       77.52083     0.29     0.6010
--> p*S*F          1     203.06250      203.06250     0.76     0.4018
--> p*p*S*F        1     275.52083      275.52083     1.03     0.3313
The appropriate regression model is
y = B0 + B1*S + B2*F + B3*p + B4*(S*F) + B5*(p*F)
where S=1 if surface=1 and 0 otherwise
and F=1 if filler=1 and 0 otherwise.
The estimates for the parameters are below:
                                    Standard
Parameter         Estimate          Error          t Value   Pr > |t|
Intercept         126.9166667 B     13.97379602     9.08     <.0001
surface            91.8333333 B      9.88096593     9.29     <.0001
filler              0.5833333 B     19.76193185     0.03     0.9768
p                   0.1650000 B      0.24203325     0.68     0.5041
surface*filler    -51.3333333 B     13.97379602    -3.67     0.0017
p*filler            1.1750000 B      0.34228670     3.43     0.0030
3d) You should recommend the treatment that leads to the lowest mean because
fabric loss is a bad thing in this situation. If you examine the output
from the estimate statements following the fit of the simplified model you
will see that we have the following regression lines relating proportion of
filler to mean fabric weight loss.
S1 F1 y^hat=168.00 + 1.340p
S1 F2 y^hat=218.75 + 0.165p
S2 F1 y^hat=127.50 + 1.340p
S2 F2 y^hat=126.92 + 0.165p
S2 F2 appears to be the best combination for preventing fabric loss.
This estimated regression line for S2 F2 lies below all the others.
There were no significant differences among the proportions of filler
for the S2 F2 combination because the slope 0.165 is not significantly
different from zero (p-value 0.5041). Thus S2, F2, and any proportion
(25, 50, 75%) would be a reasonable answer.
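The four regression lines follow directly from the parameter estimates
in 3c. The Python sketch below rebuilds them with the indicator coding
stated there (S=1 if surface=1, F=1 if filler=1) and evaluates each line
at the three proportions; the function name line is hypothetical.

```python
est = {  # parameter estimates from the simplified model in 3c
    "intercept": 126.9166667,
    "surface": 91.8333333,
    "filler": 0.5833333,
    "p": 0.1650000,
    "surface*filler": -51.3333333,
    "p*filler": 1.1750000,
}

def line(surface, filler):
    """Return (intercept, slope) of the fitted line in p for one S, F pair."""
    s = 1 if surface == 1 else 0
    f = 1 if filler == 1 else 0
    intercept = (est["intercept"] + est["surface"] * s
                 + est["filler"] * f + est["surface*filler"] * s * f)
    slope = est["p"] + est["p*filler"] * f
    return intercept, slope

combos = [(1, 1), (1, 2), (2, 1), (2, 2)]
best = {}
for p in (25, 50, 75):
    fitted = {c: line(*c)[0] + line(*c)[1] * p for c in combos}
    best[p] = min(fitted, key=fitted.get)  # lowest estimated weight loss
    print(p, {c: round(v, 2) for c, v in fitted.items()}, best[p])
```

This compares estimated means only; it does not account for their
standard errors, which is what the estimate statements are for.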
You can write estimate statements to test various differences to verify
that any choice other than S2 F2 will be inferior. An example estimate
statement is provided that shows how to compare S2, F1, p=25 to S2, F2, p=25.
There is a multiple testing issue here that you may worry about, but I
would not hesitate to recommend S2 F2 to the company if it were
necessary to recommend one combination of surface material and filler.
Try to understand all the code contained in ch6pr6sol.sas and how it
can be used to answer the questions. It would be a good idea to
sketch the four lines to help you understand the answer.