Problem a:
Ames Lab wants to relate the grain yield of rice varieties, y, to the number of tillers, x. They conducted experiments on several rice varieties and recorded the grain yield and the number of tillers for each. The results are given below:
| Grain yield, y (kg/ha) | Tillers, x (per m²) |
|---|---|
| 4,862 | 160 |
| 5,244 | 175 |
| 5,128 | 192 |
| 5,052 | 195 |
| 5,298 | 238 |
| 5,410 | 240 |
| 5,234 | 252 |
| 5,608 | 282 |
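1. We obtain the least squares line

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x, \qquad \hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$

For these data, $\bar{x} = 216.75$, $\bar{y} = 5229.5$, $SS_{xx} = 12{,}541.5$ and $SS_{xy} = 57{,}131$, so $\hat{\beta}_1 \approx 4.555$ and $\hat{\beta}_0 \approx 4242.13$, i.e. $\hat{y} = 4242.13 + 4.555x$.

[Figure: the scatter diagram for the data]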
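As a check on the least squares line, the estimates can be computed by hand from the sums of squares and compared with `lm()`. This is a minimal sketch; the variable names `tillers`, `yield` and `fit` are illustrative choices and do not come from the original solution.

```r
# Rice data from the table in Problem a
tillers <- c(160, 175, 192, 195, 238, 240, 252, 282)          # x, tillers per m^2
yield   <- c(4862, 5244, 5128, 5052, 5298, 5410, 5234, 5608)  # y, kg/ha

# Least squares estimates from the sums of squares
ss_xx <- sum((tillers - mean(tillers))^2)
ss_xy <- sum((tillers - mean(tillers)) * (yield - mean(yield)))
b1 <- ss_xy / ss_xx                      # slope
b0 <- mean(yield) - b1 * mean(tillers)   # intercept

# The same fit through lm()
fit <- lm(yield ~ tillers)
print(c(b0 = b0, b1 = b1))
print(coef(fit))

# Scatter diagram with the fitted line
plot(tillers, yield, xlab = "Tillers per m^2", ylab = "Grain yield (kg/ha)")
abline(fit)
```

The hand computation and `lm()` should agree up to floating-point error.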
Step 3 Compute an estimator, $s^2$, for the variance $\sigma^2$ of the random error $\varepsilon$:

$$s^2 = \frac{SSE}{n-2},$$

where

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = SS_{yy} - \hat{\beta}_1 SS_{xy}.$$

The computations give $s^2 = 16{,}229.66$ and $s = 127.39$. The value of $s$ implies that most of the 8 observed yields will fall within $2s = 254.78$ kg/ha of their respective predicted values.
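The variance estimate can be reproduced from the residuals of the fitted line. The sketch below assumes the same illustrative names (`tillers`, `yield`, `fit`) and compares the hand computation with the residual standard error that `summary()` reports.

```r
tillers <- c(160, 175, 192, 195, 238, 240, 252, 282)          # x, tillers per m^2
yield   <- c(4862, 5244, 5128, 5052, 5298, 5410, 5234, 5608)  # y, kg/ha
fit <- lm(yield ~ tillers)

n   <- length(yield)
sse <- sum(residuals(fit)^2)   # sum of squared errors
s2  <- sse / (n - 2)           # estimate of the error variance
s   <- sqrt(s2)                # estimated standard deviation of the error

print(c(SSE = sse, s2 = s2, s = s, two_s = 2 * s))
print(summary(fit)$sigma)      # residual standard error from lm(), should equal s
```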
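2. Suppose the researchers want to predict the grain yield when there are 210 tillers per m², i.e., $x_p = 210$. The predicted value is the height of the least squares line at $x_p$:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_p \approx 5199 \text{ kg/ha}.$$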
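3. If we want a 95% prediction interval, we calculate

$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}} = 5199 \pm (2.447)(127.39)\sqrt{1 + \frac{1}{8} + \frac{(210 - 216.75)^2}{12{,}541.5}} = 5199 \pm 331.18,$$

where $t_{0.025} = 2.447$ is based on $n - 2 = 6$ df.

Thus, for 210 tillers per m², the model yields a 95% prediction interval for the grain yield from 4867.82 kg/ha to 5530.18 kg/ha.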
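The point prediction and the prediction interval can also be obtained with `predict()`. This is a sketch under the same illustrative variable names. Because it carries full precision instead of rounded intermediate values, its endpoints may differ from the hand-computed ones by a few tenths of a kg/ha.

```r
tillers <- c(160, 175, 192, 195, 238, 240, 252, 282)          # x, tillers per m^2
yield   <- c(4862, 5244, 5128, 5052, 5298, 5410, 5234, 5608)  # y, kg/ha
fit <- lm(yield ~ tillers)

# Predict a single new observation at x_p = 210 tillers per m^2
new_x <- data.frame(tillers = 210)
pi95  <- predict(fit, newdata = new_x, interval = "prediction", level = 0.95)
print(pi95)   # columns: fit (point prediction), lwr, upr

# Half-width of the interval, for comparison with the hand formula
half_width <- unname((pi95[1, "upr"] - pi95[1, "lwr"]) / 2)
print(half_width)
```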
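4. We test the hypothesis that the slope $\beta_1$ is 0, i.e., that there is no linear relationship between the grain yield, $y$, and the number of tillers, $x$. We test:

$$H_0: \beta_1 = 0 \quad \text{versus} \quad H_a: \beta_1 \neq 0.$$

Test statistic:

$$t = \frac{\hat{\beta}_1}{s / \sqrt{SS_{xx}}}.$$

For the significance level $\alpha = 0.05$, we reject $H_0$ if $|t| > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on $(n - 2) = (8 - 2) = 6$ df. For 6 df we find $t_{0.025} = 2.447$, and

$$t = \frac{4.555}{127.39 / \sqrt{12{,}541.5}} \approx 4.00.$$

This t-value is greater than $t_{0.025}$. Thus, we reject the hypothesis $\beta_1 = 0$.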
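The slope test can be checked against the coefficient table that `summary()` produces. The sketch below uses the same illustrative names and compares the t statistic with the two-sided critical value from `qt()`.

```r
tillers <- c(160, 175, 192, 195, 238, 240, 252, 282)          # x, tillers per m^2
yield   <- c(4862, 5244, 5128, 5052, 5298, 5410, 5234, 5608)  # y, kg/ha
fit <- lm(yield ~ tillers)

# Row of the coefficient table for the slope
slope_row <- coef(summary(fit))["tillers", ]
t_stat <- unname(slope_row["t value"])

# Two-sided critical value at alpha = 0.05 with n - 2 degrees of freedom
alpha  <- 0.05
t_crit <- qt(1 - alpha / 2, df = df.residual(fit))

print(c(t = t_stat, t_crit = t_crit))
reject_h0 <- abs(t_stat) > t_crit   # TRUE means H0: slope = 0 is rejected
print(reject_h0)
```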
Problem b:
> rent<-read.table("http://www.statsci.org/data/oz/rentcap.txt", header=T)
> r<-rent[,2]
> c<-rent[,1]
> cor(c,r)
[1] 0.7868013
> regdata<-data.frame(cbind(r,c))
> regout=lm(r~1+c,regdata)
> summary(regout)
Call:
lm(formula = r ~ 1 + c, data = regdata)
Residuals:
     Min       1Q   Median       3Q      Max
 -2734.4  -1018.9   -163.6    787.7   3943.2
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.840e+03  4.126e+02   11.73   <2e-16 ***
c           2.745e-02  2.221e-03   12.36   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1482 on 94 degrees of freedom
Multiple R-Squared: 0.6191, Adjusted R-squared: 0.615
F-statistic: 152.8 on 1 and 94 DF, p-value: < 2.2e-16
> reg.aov<-aov(r~c,regdata)
> summary(reg.aov)
            Df    Sum Sq   Mean Sq F value    Pr(>F)
c            1 335671928 335671928  152.76 < 2.2e-16 ***
Residuals   94 206559807   2197445
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> plot(regout)
[Figures: the four diagnostic plots produced by plot(regout)]
> rr<-c(r[1:12],r[14:22],r[25:96])
> cc<-c(c[1:12],c[14:22],c[25:96])
> regdata2<-data.frame(cbind(rr,cc))
> regout2=lm(rr~1+cc,regdata2)
> summary(regout2)
Call:
lm(formula = rr ~ 1 + cc, data = regdata2)
Residuals:
     Min       1Q   Median       3Q      Max
 -2293.3   -956.2   -162.6    738.3   3994.9
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.728e+03  3.930e+02   12.03   <2e-16 ***
cc          2.781e-02  2.094e-03   13.29   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1377 on 91 degrees of freedom
Multiple R-Squared: 0.6598, (Improved) Adjusted R-squared: 0.6561
F-statistic: 176.5 on 1 and 91 DF, p-value: < 2.2e-16
> reg2.aov<-aov(rr~1+cc,regdata2)
> summary(reg2.aov)
            Df    Sum Sq   Mean Sq F value    Pr(>F)
cc           1 334904340 334904340  176.53 < 2.2e-16 ***
Residuals   91 172645774   1897206
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> plot(regout2)
[Figures: the four diagnostic plots produced by plot(regout2)]
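The second fit omits observations 13, 23 and 24: the index ranges 1:12, 14:22 and 25:96 skip exactly those positions. The same subsetting can be written more compactly with negative indexing. The sketch below shows the equivalence on a stand-in vector `r_demo`, which is hypothetical and only mimics the length of the real data (96 observations) so that the example runs without downloading rentcap.txt.

```r
# Hypothetical stand-in for a data column of length 96 (not the real rent data)
r_demo <- seq_len(96) * 10

# Observations left out of the second fit
dropped <- c(13, 23, 24)

# Subsetting as written in the session: concatenate the kept ranges
rr_long <- c(r_demo[1:12], r_demo[14:22], r_demo[25:96])

# Equivalent subsetting with negative indices
rr_short <- r_demo[-dropped]

print(identical(rr_long, rr_short))
print(length(rr_short))   # 93 observations remain, giving 91 residual df
```

With the real data the same idea would read `rr <- r[-dropped]` and `cc <- c[-dropped]`, which keeps the two vectors aligned because both drop the same positions.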
