1) This problem is based on a commonly used design for sensory evaluations. Panelists are recruited to participate in the study. Around ISU, sensory panelists are often graduate students, staff, some faculty, and some community members. These panelists may be considered to be a random sample of some larger population (e.g. Ames residents). I'm sceptical about the validity of this, but we will not question it for the rest of this problem.
These data are made up based on a trial of tofu made from three different types of soybeans. One type (C) is a control; the other two (A and B) are soybean lines bred to reduce an off flavor (beaniness). Panelists are "trained" by tasting a very bland sample and a very beany-tasting sample. Then, each panelist tastes the three types of tofu. They judge the beaniness of each sample on a continuous scale from 0 to 15 (0 = not beany, 15 = very beany). There are a total of 15 participants in this study. The data in soytaste.txt include 45 observations (15 panelists x 3 tofu types).
You decide to analyze the data using an RCBD model with fixed
blocks. Use fixed blocks for parts a-d.
a) Test the null hypothesis of no difference in beaniness
among the three soybean lines. Report your test statistic and
p-value.
b) Estimate the difference between soybean A and the control.
Report the estimate and its s.e.
c) Estimate the mean beaniness for soybean type A. Report the mean
and its s.e.
d) The developer of the soybean lines is very interested in whether
their type (A) has acceptable beaniness. Values less than 3 are
considered acceptable. Test whether the mean for type A is
significantly different from 3. Report your test statistic and
p-value.
Hint: This may be easiest by hand, using information from c.
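One way to set up the hand calculation is sketched below. It assumes the fixed-block RCBD fit from parts a-c, so the standard error from part c is based on the error mean square with (15-1)(3-1) = 28 degrees of freedom; the symbols t, T and nu are notation introduced here, not taken from the assignment.

```latex
t = \frac{\bar{y}_{A} - 3}{\operatorname{se}(\bar{y}_{A})},
\qquad
p = 2\,P\left(T_{\nu} \ge |t|\right),
\qquad
\nu = (15-1)(3-1) = 28
```

Here T_nu is a t random variable with nu degrees of freedom, and the p-value is two-sided because the question asks whether the mean differs from 3.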
After discussion with classmates, you decide that blocks should
really be considered random.
e-h) Repeat a-d considering blocks to be a random effect.
i) Which among your answers to the previous parts are the same for
fixed and random blocks? Which are different?
j) The producer of type A is considering making tofu and selling it
in Ames. They will do this if they are reasonably certain that
their product is acceptable to the Ames population. Remember that
acceptable is defined as beaniness less than 3. Should they proceed?
2) The data for this problem are one small part of a survey of plankton (small aquatic invertebrates) in a single lake. Plankton are often sampled by towing a net through the water, then collecting, sorting to species, and counting the animals caught in the net. These data are from 12 separate tows of the net, taken in randomly chosen places in the lake. plankton.txt contains data on four species, reported as number per liter of water. The investigators want to estimate the abundance of each species and quantify the relationship between the four species. The total number of animals caught in each tow varies because plankton are more abundant in some parts of the lake. The investigators propose to treat tow as a blocking variable. In case you're wondering about the ecology, these 12 tows are samples of the same plankton community, so blocking is reasonable.
The first parts of this problem concern diagnostics; treat blocks as fixed effects (i.e. use proc glm if you're using SAS).
For parts a and b, use an additive block model.
a) The investigators are especially interested in the difference between species
1 and species 2. Estimate that difference and test whether it equals 0.
Report the estimate, standard error and p-value.
b) The investigators are surprised by the result from part a, because it isn't what they
expected. They ask you to check assumptions of the block model. Plot
the abundances against the block means and plot residuals against predicted
values. Do the additivity and equal variance assumptions seem reasonable?
c) A friend suggests a log transformation. That is, analyze log(abundance)
using an additive block model. Fit this model and check the assumptions.
Do additivity and equal variances seem reasonable after transformation?
Hint/reminder: In SAS, you log transform values in a data step by inserting
logY = log(Y);
after the input statement and before the cards; or run; statement.
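As a rough illustration of where that line goes, a data step might look like the sketch below. The file layout and the variable names tow, species and Y are assumptions made for this example; check them against the actual contents of plankton.txt (for instance, a header line would need firstobs=2 on the infile statement).

```sas
/* Sketch only: the layout of plankton.txt and the names
   tow, species and Y are assumed, not taken from the file. */
data plankton;
  infile 'plankton.txt';
  input tow species Y;
  logY = log(Y);   /* natural log; a zero abundance gives a missing value */
run;
```

The new variable logY can then be used as the response in place of Y when fitting the additive block model.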
d) Estimate and test the difference in log abundance between species 1 and 2.
Report the estimate (on log scale), standard error and p-value.
3) A group of investigators are interested in the effect of planting
density on the yield of millet, a small grain.
The investigators are comparing five planting densities, labelled 2,
4, 6, 8, and 10.
a) In their first study, the investigators randomly assigned density
treatments to plots in a 5 x 5 Latin Square. The data are in
millet.txt. Test the hypothesis of no effect of treatment (i.e. no
differences between planting densities). Report the F statistic and
p-value.
b) If the study is repeated next year, would you recommend it still
be a Latin Square? Or, should the investigators just use row
blocks, just use column blocks, or just use a CRD? Explain briefly.
c) The next year, the investigators repeated the study using a larger
experiment, to give more precise estimates of the treatment means.
They used 2 fields, each with a 5 x 5 Latin Square. The same five
treatments were used in both fields. The data are in millet2.txt.
Test the hypothesis of no differences between treatments.
Report the F statistic and p-value.