Homework #6 - due 5 pm, Friday, 29 Feb 2009

1) This problem is based on a commonly used design for sensory evaluations. Panelists are recruited to participate in the study. Around ISU, sensory panelists are often graduate students, staff, some faculty, and some community members. These panelists may be considered to be a random sample of some larger population (e.g. Ames residents). I'm sceptical about the validity of this, but we will not question it for the rest of this problem.

These data are made up based on a trial of tofu made from three different types of soybeans. One type (C) is a control; the other two (A and B) are soybean lines bred to reduce an off flavor (beaniness). Panelists are "trained" by tasting a very bland sample and a very beany-tasting sample. Then, each panelist tastes the three types of tofu. They judge the beaniness of each sample on a continuous scale from 0 to 15 (0 = not beany, 15 = very beany). There are a total of 15 participants in this study. The data in soytaste.txt include 45 observations.

You decide to analyze the data using an RCBD model with fixed blocks. Use fixed blocks for parts a-d.
a) Test the null hypothesis of no difference in beaniness among the three soybean lines. Report your test statistic and p-value.
b) Estimate the difference between soybean A and the control. Report the estimate and its s.e.
c) Estimate the mean beaniness for soybean type A. Report the mean and its s.e.
d) The developer of the soybean lines is very interested in whether their type (A) has acceptable beaniness. Values less than 3 are considered acceptable. Test whether the mean for type A is significantly different from 3. Report your test statistic and p-value.
Hint: This may be easiest by hand, using information from c.
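
The hand calculation suggested in the hint can be sketched as follows. This is a minimal sketch, assuming the usual t statistic (estimate minus hypothesized value, divided by its s.e.) and, for the fixed-block RCBD with one observation per panelist and soybean type, error degrees of freedom (blocks - 1)(treatments - 1). The function names are made up for illustration, and the numbers in the usage lines are placeholders, not values from soytaste.txt.

```python
def t_statistic(estimate, std_error, null_value):
    """t statistic for H0: mean = null_value."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return (estimate - null_value) / std_error


def rcbd_error_df(n_blocks, n_treatments):
    """Error df for an additive RCBD with one observation per cell."""
    return (n_blocks - 1) * (n_treatments - 1)


# Placeholder inputs (hypothetical, NOT taken from soytaste.txt).
t_value = t_statistic(estimate=4.0, std_error=0.5, null_value=3.0)
error_df = rcbd_error_df(n_blocks=15, n_treatments=3)
print(t_value, error_df)
```

The p-value then comes from a t distribution with that many degrees of freedom, using a table or your statistical software.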

After discussion with classmates, you decide that blocks should really be considered random.
e-h) Repeat parts a-d, considering blocks to be a random effect.
i) Which among your answers to the previous parts are the same for fixed and random blocks? Which are different?
j) The producer of type A is considering making tofu and selling it in Ames. They will do this if they are reasonably certain that their product is acceptable to the Ames population. Remember that acceptable is defined as beaniness less than 3. Should they proceed?

2) The data for this problem are one small part of a survey of plankton (small aquatic invertebrates) in a single lake. Plankton are often sampled by towing a net through the water, then collecting, sorting to species, and counting the animals caught in the net. These data are from 12 separate tows of the net, taken in randomly chosen places in the lake. plankton.txt contains data on four species, reported as number per liter of water. The investigators want to estimate the abundance of each species and quantify the relationships among the four species. The total number of animals caught in each tow varies because plankton are more abundant in some parts of the lake. The investigators propose to treat tow as a blocking variable. In case you're wondering about the ecology, these 12 tows are samples of the same plankton community, so blocking is reasonable.

The first parts of this problem concern diagnostics; treat blocks as fixed effects (i.e. use proc glm if you're using SAS).

For parts a and b, use an additive block model.
a) The investigators are especially interested in the difference between species 1 and species 2. Estimate that difference and test whether it equals 0. Report the estimate, standard error and p-value.
b) The investigators are surprised by the result from part a, because it isn't what they expected. They ask you to check assumptions of the block model. Plot the abundances against the block means and plot residuals against predicted values. Do the additivity and equal variance assumptions seem reasonable?
c) A friend suggests a log transformation. That is, analyze log(abundance) using an additive block model. Fit this model and check the assumptions. Do additivity and equal variances seem reasonable after transformation?
Hint/reminder: In SAS, you log transform values in a data step by inserting
logY = log(Y);
after the input command and before the cards; or run; statements.
d) Estimate and test the difference in log abundance between species 1 and 2. Report the estimate (on log scale), standard error and p-value.
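
If you are not using SAS, the quantities needed for the diagnostic plots in parts b and c can be computed directly. The following is a minimal sketch, assuming a complete, balanced layout (one value per tow and species), in which the additive fit is block mean + treatment mean - grand mean. The function names and the toy table are hypothetical; the toy values are deliberately multiplicative and are NOT from plankton.txt.

```python
import math


def additive_fit(table):
    """table[i][j]: response for block i, treatment j (complete, balanced).

    Returns (block_means, fitted, residuals) for the additive model
    fitted = block mean + treatment mean - grand mean.
    """
    n_blocks = len(table)
    n_trt = len(table[0])
    if any(len(row) != n_trt for row in table):
        raise ValueError("every block needs one value per treatment")
    block_means = [sum(row) / n_trt for row in table]
    trt_means = [sum(row[j] for row in table) / n_blocks for j in range(n_trt)]
    grand = sum(block_means) / n_blocks
    fitted = [[block_means[i] + trt_means[j] - grand for j in range(n_trt)]
              for i in range(n_blocks)]
    residuals = [[table[i][j] - fitted[i][j] for j in range(n_trt)]
                 for i in range(n_blocks)]
    return block_means, fitted, residuals


def log_table(table):
    """Natural-log transform; requires strictly positive values."""
    return [[math.log(y) for y in row] for row in table]


# Toy multiplicative data (hypothetical, NOT from plankton.txt).
toy = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
_, _, raw_resid = additive_fit(toy)
_, _, log_resid = additive_fit(log_table(toy))
```

Plotting each observation against its block mean, and each residual against its fitted value, gives the two plots requested in part b. In the toy table the second column is always twice the first, so the additive fit leaves large residuals on the raw scale and essentially none on the log scale.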

3) A group of investigators are interested in the effect of planting density on the yield of millet, a small grain. The investigators are comparing five planting densities, labelled 2, 4, 6, 8, and 10.
a) In their first study, the investigators randomly assigned density treatments to plots in a 5 x 5 Latin Square. The data are in millet.txt.
Test the hypothesis of no effect of treatment (i.e. no differences between planting densities). Report the F statistic and p-value.
b) If the study is repeated next year, would you recommend it still be a Latin Square? Or, should the investigators just use row blocks, just use column blocks, or just use a CRD? Explain briefly.
c) The next year, the investigators repeated the study using a larger experiment, to give more precise estimates of the treatment means. They used 2 fields, each with a 5 x 5 Latin Square. The same five treatments are used in both fields. The data are in millet2.txt. Test the hypothesis of no differences between treatments. Report the F statistic and p-value.
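
For reference, the sums-of-squares arithmetic behind the treatment F test in a single Latin Square can be sketched as below. This is a minimal sketch, assuming one k x k square with rows, columns, and treatments all treated as fixed, additive effects; it does not cover the two-field design in part c. The function name, dictionary keys, and the 3 x 3 toy values are hypothetical and are NOT from millet.txt.

```python
def latin_square_anova(y, trt):
    """y[r][c]: response; trt[r][c]: treatment label in row r, column c.

    Returns sums of squares, degrees of freedom, and the treatment
    F statistic for a single k x k Latin square.
    """
    k = len(y)
    labels = sorted({t for row in trt for t in row})
    # Check the layout: each treatment once per row and once per column.
    for i in range(k):
        column = sorted(trt[r][i] for r in range(k))
        if sorted(trt[i]) != labels or column != labels:
            raise ValueError("layout is not a Latin square")
    cells = [(r, c) for r in range(k) for c in range(k)]
    grand = sum(y[r][c] for r, c in cells) / k**2
    row_means = [sum(y[r]) / k for r in range(k)]
    col_means = [sum(y[r][c] for r in range(k)) / k for c in range(k)]
    trt_means = [sum(y[r][c] for r, c in cells if trt[r][c] == t) / k
                 for t in labels]
    ss_total = sum((y[r][c] - grand) ** 2 for r, c in cells)
    ss_row = k * sum((m - grand) ** 2 for m in row_means)
    ss_col = k * sum((m - grand) ** 2 for m in col_means)
    ss_trt = k * sum((m - grand) ** 2 for m in trt_means)
    ss_err = ss_total - ss_row - ss_col - ss_trt
    df_trt = k - 1
    df_err = (k - 1) * (k - 2)
    return {
        "ss_row": ss_row, "ss_col": ss_col, "ss_trt": ss_trt,
        "ss_error": ss_err, "df_trt": df_trt, "df_error": df_err,
        "f_trt": (ss_trt / df_trt) / (ss_err / df_err),
    }


# Toy 3 x 3 square (hypothetical values, NOT from millet.txt).
toy_trt = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
toy_y = [[10, 12, 15], [11, 16, 9], [14, 8, 13]]
result = latin_square_anova(toy_y, toy_trt)
```

For a 5 x 5 square the same formulas give 4 treatment and 12 error degrees of freedom; the p-value comes from the corresponding F distribution.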