Peter Bryant
College of Business
University of Colorado at Denver
Denver, CO 80217-3364
pbryant@castle.cudenver.edu
303-556-5833
Elsewhere in this issue, you will find the report of the CSNA nominating committee, containing their nominations for two positions on the CSNA Board of Directors and for the position of CSNA Business Manager. I thank the nominating committee and the candidates for their help and participation.
I hope you will review that list of candidates, and note that other candidates may also be nominated. The field of classification takes on many new forms and touches on many new disciplines. Please make your views known on the directions we should be taking, and please vote when you receive your ballot this fall.
Plans are proceeding for CSNA '97, to be held June 12-14, 1997 at American University in Washington, D.C. The local arrangements are being handled by Professor Olga Cordero-Braña of American University. The Program Chair will be Professor David Banks of Carnegie Mellon University. More detailed announcements and a Call for Papers will be forthcoming this fall, but now would be a good time to mark your calendars and plan to attend. The Washington area offers much by way of vacation and entertainment opportunities for families, and the preliminary program discussions suggest the conference will be a stimulating one. I hope to see you there.
Dawn Iacobucci
Department of Marketing
Kellogg Graduate School of Management
Northwestern University
2001 Sheridan Road
Evanston, IL 60208
The nominating committee, consisting of Martha Cooper (chair), David Banks, and Mike Windham, has proposed the following slate of candidates for CSNA offices:
For CSNA Directors (to fill two slots):
Don Dearholt, Mississippi State University
Boris Mirkin, Rutgers University
F. James Rohlf, SUNY at Stony Brook
Olga Cordero-Braña, American University
For CSNA Business Manager:
Stanley L. Sclove, University of Illinois at Chicago
Pursuant to CSNA bylaws, other nominations are solicited. The bylaws provide that any candidate receiving 5 nominations will also appear on the ballot. Such nominations should be received by the secretary by November 1.
Ballots will include short biographical information provided by the candidates. This year, the ballots must be mailed, though we hope to have electronic elections by the next time. We plan to mail them about November 15, with votes due to the secretary by December 31.
F.R. McMorris
Department of Mathematics
University of Louisville
Louisville, KY 40292
frmcmo01@homer.louisville.edu
(502)852-6826
I am extremely pleased to announce that David Banks has agreed to write regular Forum articles which should appear in nearly every issue. David should be stirring the pot a bit and I welcome responses, possibly in the form of other Forum articles.
BAYESIAN INFANTS
David Banks, Dept. of Statistics
Carnegie Mellon University, Pittsburgh, PA 15213
banks@stat.cmu.edu
Bayesian inference has engendered, answered, and counterproposed a number of paradoxes. Lindley's Paradox, Basu's Elephant, the St. Petersburg Problem, and Godambe's Paradox are four famous examples (cf. Shafer, 1982; Basu, 1971; Robbins, 1961; Genest and Schervish, 1985) but there are many others. This short piece describes a puzzle that does not have the dignity of paradox, but which nonetheless points up an intriguing aspect of pattern recognition in the context of the Bayesian and complexity theoretic philosophies.
Suppose a superintelligent, perfectly rational baby is born, then placed into a Skinner box. Each day, the baby must guess a digit, either zero or one; if the baby guesses correctly, she is fed. To avoid technicalities that arise from a countably infinite sample space, we assume that the trial persists for a fixed duration, and that this duration is known to the child.
Since the baby is so smart, many philosophers of statistics (in particular, my colleagues at CMU) would assert that she could deduce the superiority of Bayesianity from natural axiomatizations of probability and decision theory. Thus her forecasts should be Bayesian, picking the digit which maximizes her posterior probability conditional on the sequence of previously observed digits. The only issue is the determination of which prior the baby should choose.
However, since the baby is a baby, she has no experience of the world, and thus no predisposition to consider any sequence more likely than another. Given this state of perfect ignorance, her only coherent (this is a Bayesian buzzword, indicating that the child acts in a way consistent with the goal of not going hungry) choice of prior is the uniform distribution on all sequences of length equal to the number of days in the box. (Some might argue for another model of perfect ignorance, in which the baby assumes that the digits are exchangeable, implying that all inference should be based only upon the total number of ones observed. But that model makes stronger assumptions, and leads to the same problem as the uniform prior.)
In this case, Bayesian analysis with a uniform prior is a mathematical formula for disaster. It turns out to exactly prevent the baby from learning about the next digit, no matter what sequence is presented. For example, if the baby saw 1000 zeroes in 1000 consecutive days, a short Bayesian calculation shows that the posterior probability of a zero tomorrow is unchanged at .5. But clearly, an ordinary kid would quickly eat the smart girl's lunch in any competition involving a patterned sequence of digits (or even an unpatterned sequence, if one digit or the other was shown with greater probability). Even if the digits were equiprobably random, a dumb kid would do as well as the smart one.
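The "short Bayesian calculation" can be checked by brute force on a small trial. The sketch below is only an illustration: the function name, the 14-day trial length, and the enumeration approach are assumptions made here, not part of the original argument. It counts the sequences consistent with what has been observed and reports the fraction of them that show a zero on the next day.

```python
from fractions import Fraction
from itertools import product


def posterior_next_is_zero(observed, total_days):
    """Posterior P(next digit = 0 | observed) under a uniform prior
    over all binary sequences of length total_days (hypothetical helper)."""
    k = len(observed)
    if k >= total_days:
        raise ValueError("no days left to predict")
    target = tuple(observed)
    consistent = 0    # sequences that agree with every digit seen so far
    next_is_zero = 0  # ... and that also show a zero on the next day
    for seq in product((0, 1), repeat=total_days):
        if seq[:k] == target:
            consistent += 1
            if seq[k] == 0:
                next_is_zero += 1
    return Fraction(next_is_zero, consistent)


# Ten zeroes in a row during a 14-day trial: the posterior is still one half.
print(posterior_next_is_zero([0] * 10, 14))
```

Every continuation of the observed prefix is equally likely under this prior, so exactly half of the consistent sequences have a zero next, whatever the prefix was.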
To add injury to this insult, the way to make the smart girl succeed is to damage her. For example, if we were to impair her memory, so that she could not recall a long sequence of outcomes, then she would necessarily develop useful summaries of the data, such as the proportion of ones or a short repeated pattern. Also, if her rationality were corrupted by superstitions, so that sequences were not all equiprobable, then in many cases she could learn quickly. Finally, if she were given stupefying drugs that made her too slow to engage in infinite sequences of mental wagers or long calculations, then again she would, under many circumstances, discover patterns as ably as any dull child.
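As a rough illustration of the memory-impaired child, the sketch below guesses whichever digit was in the majority over the last few days. The majority rule, the five-day memory window, the tie-breaking toward zero, and the function names are all assumptions chosen for this example; the text itself only says that a limited memory forces useful summaries such as the proportion of ones.

```python
def limited_memory_guess(history, memory=5):
    """Guess the majority digit among the last `memory` observations.
    Ties (including an empty history) are broken in favor of zero."""
    recent = history[-memory:]
    ones = sum(recent)
    return 1 if 2 * ones > len(recent) else 0


def meals_earned(sequence, memory=5):
    """Count the days on which the limited-memory guess matches the digit."""
    meals = 0
    for day, digit in enumerate(sequence):
        if limited_memory_guess(sequence[:day], memory) == digit:
            meals += 1
    return meals


all_zeroes = [0] * 1000
mostly_ones = ([1] * 9 + [0]) * 100  # nine ones, then a zero, repeated

print(meals_earned(all_zeroes))
print(meals_earned(mostly_ones))
# The uniform-prior baby is indifferent every day, so whatever she guesses
# she can expect to eat on only half of the days:
print(len(all_zeroes) / 2)
```

On both sequences the forgetful child eats on most days, while the uniform-prior baby, whose posterior never moves from one half, has no basis for any particular guess.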
Obviously, the example relies on the fact that dumb kids are predisposed by their limitations to look for simple patterns. Through both experience of the world and evolutionary adaptation, humans are hardwired to shun complexity, and this handicap is probably the valuable basis of the elegantly parsimonious understanding of physics and mathematics. Our slack-jawed prejudice for the simple has been incorporated into statistics through minimum description length methods (cf. Cover and Thomas, 1991; Rissanen, 1989).
A minimum description length method bases inference on the shortest way of rewriting the observed data. In the baby's case, the rewrite might involve three components: a model part, a parameter part, and a deviation part. Two examples illustrate this:
* If the experimenter chooses digits according to tosses of a biased coin with p = P[next digit is one], then the model part specifies independent Bernoulli trials, the parameter part estimates p, and the deviation part provides exception reporting when the less probable outcome occurred.
* If the experimenter chooses digits according to a simple pattern, such as alternating digits, then the model part specifies alternation of pattern, the parameter part specifies the elementary pattern (01 or 10), and the deviation part is empty.
It is not obvious that these components automatically produce the minimum length expression, but that is the subject of deeper study. Wallace and Freeman (1987) offer an introduction to the topic that is couched in statistician-friendly language.
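The two examples above can be turned into a toy comparison of description lengths. This is a minimal sketch under crude coding assumptions made up for illustration (half of log2 n bits for the Bernoulli parameter, log2 n bits per reported exception, and so on); it is not the coding scheme of Wallace and Freeman or of Rissanen. The model part is left out because naming one of three candidate models costs the same for each.

```python
import math


def literal_bits(seq):
    """Cost of writing the digits out verbatim: one bit per day."""
    return float(len(seq))


def bernoulli_bits(seq):
    """Parameter part plus deviation part for independent Bernoulli trials."""
    n = len(seq)
    ones = sum(seq)
    parameter_bits = 0.5 * math.log2(n)  # assumed cost of stating p
    data_bits = 0.0
    for count in (ones, n - ones):
        if count:
            data_bits -= count * math.log2(count / n)
    return parameter_bits + data_bits


def alternation_bits(seq):
    """Parameter part (01 or 10) plus exception reporting for alternation."""
    n = len(seq)
    best = math.inf
    for start in (0, 1):
        exceptions = sum(1 for i, d in enumerate(seq) if d != (start + i) % 2)
        # 1 bit for the elementary pattern, then the exception count,
        # then the position of each exception (assumed costs).
        bits = 1 + math.log2(n + 1) + exceptions * math.log2(n)
        best = min(best, bits)
    return best


def shortest_description(seq):
    """Name of the candidate model giving the fewest bits."""
    costs = {
        "literal": literal_bits(seq),
        "bernoulli": bernoulli_bits(seq),
        "alternation": alternation_bits(seq),
    }
    return min(costs, key=costs.get)


print(shortest_description([0, 1] * 50))           # alternating digits
print(shortest_description([0] * 90 + [1] * 10))   # heavily biased coin
```

A patterned or biased sequence is described in far fewer bits than the number of days, and that saving is what lets a child who prefers short descriptions predict the next digit.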
If the experimenter were malicious, and if he thought that some unscientifically sympathetic lab assistant had slipped the baby a hint about the fact that (on empirical grounds) the real world tends to favor simplicity, then he could attempt to confound the baby's use of minimum description length inference by choosing a sequence of maximal complexity. It is known that there are many such sequences, but in general it is hard to prove that a specific sequence attains the maximum. However, such sequences will look like the typical outcomes of repeated tosses of a fair coin marked with a one and a zero; also, the sequences will have equal numbers of ones and zeroes. In this case, it is impossible for the baby to do better than chance guessing.
Bayesians are aware that accumulated human experience (but no other reason) suggests the value of models that favor simplicity. They accomplish this aim by imposing roughness penalties, the BIC (a quasi-Bayesian alternative to Akaike's Information Criterion in model selection), the Schwarz criterion, and other somewhat ad hoc devices. Berger and Jefferys (1992) have tried to impose a more principled order on this chaos, but their solution is not one that an ignorant infant could plausibly be expected to deduce. I. J. Good has also addressed related issues in this area, and his ideas are always fresh and entertainingly presented. I recommend Good (1983, Part IV) as a point of entry to his thinking.
References
Basu, D. (1971). An essay on the logical foundations of survey sampling, Part 1 (with discussion). In _Foundations of Statistical Inference_, ed. by Godambe, V.P. and Sprott, D.A. Holt, Rinehart, and Winston, Toronto, 203-242.
Berger, J. O. and Jefferys, W. H. (1992). The application of robust Bayesian analysis to hypothesis testing and Occam's Razor. _Journal of the Italian Statistical Society_, 1, 17-32.
Cover, T. and Thomas, J. (1991). _Elements of Information Theory_. Wiley, New York.
Genest, C. and Schervish, M. J. (1985). Resolution of Godambe's paradox (with discussion). _Canadian Journal of Statistics_, 13, 293-301.
Good, I. J. (1983). _Good Thinking: The Foundations of Probability and Its Applications_. University of Minnesota Press.
Rissanen, J. (1989). _Stochastic Complexity in Statistical Inquiry_. World Scientific, New Jersey.
Robbins, H. (1961). Recurrent games and the Petersburg paradox. _Annals of Mathematical Statistics_, 32, 187-194.
Shafer, G. (1982). Lindley's paradox (with discussion). _Journal of the American Statistical Association_, 77, 325-351.
Wallace, C. S. and Freeman, P. R. (1987). Estimation and inference by compact coding (with discussion). _Journal of the Royal Statistical Society, Series B_, 49, 240-252.
Rian van Blokland-Vogelesang
SWOV Institute for Road Safety Research
P.O. Box 170
2260 AD Leidschendam
The Netherlands
Blokland@SWOV.nl
The following recent and forthcoming publications might be of interest to CSNA members. (Prices are approximate. # represents pounds.)
C. Alexander, Financial Risk Management and Analysis, New York: Wiley, 1996, pp. 352, #55. ISBN 0471-95309-1.
C. Brand, The g Factor: General Intelligence and its Implications, New York: Wiley, 1995, pp. 175, #12.95, ISBN 0471-96070-5 (pbk); #24.95, ISBN 0-471-96069-1 (hbk).
D.S. Bridge and G.B. Mehta, Representations of Preference Orderings, New York: Springer, Lecture Notes in Economics and Mathematical Systems, Vol. 422, 1995, pp. 165, $23.-. ISBN 3-540-58839-6.
H. Brown and R.J. Prescott, Analysis of Medical Data Using Mixed Models, New York: Wiley, 1997, pp. 260, #29.95. ISBN 0471-96554-5.
C.E. Buck, C.D. Litton, and W.G. Cavanagh, The Bayesian Approach to Interpreting Archaeological Data, New York: Wiley, 1996, pp. 350, #29.95. ISBN 0471-96197-3.
H. Buhlmann (Ed.), International Actuarial Association's Centenary: Special Issue of Wiley Applied Stochastic Models and Data Analysis Journal, in Recognition of IAA's 100th Anniversary (includes applications of Risk Theory, Markov Chain Interest Model, Balanced Credibility Estimation, etc.), New York: Wiley, 1995, pp. 72, #70.-. ISBN 8755-0024.
B.G. Cox, B. Nanjamma, P.S. Kott, M. Colledge, D.A. Binder, and A. Christianson (Eds), Business Survey Methods, New York: Wiley, 1995, pp. 752, #115.-. ISBN 0471-59852-6.
T.F. Cox and M.M.A. Cox, Multidimensional Scaling, London (UK): Chapman and Hall, Monographs on Statistics and Applied Probability, Vol. 59, 1994, pp. 225, $16.-. ISBN 0-412-49120-6 (incl. 3.5" diskette).
D. Edwards, Introduction to Graphical Modelling, New York: Springer Verlag, Texts in Statistics, 1995, pp. 280, $39.-. ISBN 0-387-94483-4 (incl. 3.5" diskette).
P.R. Freeman and A.F.M. Smith (Eds.), Aspects of Uncertainty: A Tribute to D.V. Lindley, New York: Wiley, 1994, pp. 412, #55.-. ISBN 0471-94347-9.
W. Fuller, Introduction to Statistical Time Series (2nd ed.), New York: Wiley, 1996, pp. 736, #60.-. ISBN 0471-55239-9.
A.E.G. Gelfand and A.F.M. Smith, Bayesian Computation, New York: Wiley, 1997, pp. 400, #45.-. ISBN 0471-93856-4.
H.U. Gerber, Life Insurance Mathematics (2nd ed.), New York: Springer, 1995, pp. 200, $36.-. ISBN 3-540-58858-2.
M. Ghosh, N. Mukhopadhyay and P.K. Sen, Sequential Estimation, New York: Wiley, 1996, pp. 400, #50.-. ISBN 0471-81271-4.
P. Gilster, Finding It on the Internet: The Internet Navigator's Guide to Search Tools and Techniques, 1996 (2nd ed.), New York: Wiley, 1996, pp. 400, #16.99. ISBN 0471-12695-0.
H. Goldstein and T. Lewis (Eds.), Assessment: Problems, Developments and Statistical Issues. New York: Wiley, 1996, pp. 280, #39.95. ISBN 0471-95668-6.
I. Graham, The HTML Sourcebook: A Complete Guide to HTML 3.0, 1996 (2nd ed.), New York: Wiley, 1996, pp. 640, #19.95. ISBN 0471-14242-5.
H. Gzyl, The Method of Maximum Entropy, Singapore: World Scientific Publishing, Series: Advances in Mathematics for Applied Sciences, Vol. 29, 1995, pp. 150, $48.-. ISBN 981-02-1812-5.
J.F. Hair Jr., R.E. Anderson, R.E. Tatham, and W.C. Black, Multivariate Data Analysis with Readings (4th ed.), Englewood Cliffs (NJ): Prentice Hall, 1995, pp. 770, $41.95. ISBN 0-13-180969-5.
D.J. Hand, Construction and Assessment of Classification Rules, New York: Wiley, 1996, pp. 300, #34.95. ISBN 0471-96583-9.
P.E. Jupp and K.V. Mardia, Statistics of Directional Data (2nd ed.), New York: Wiley, 1996, pp. 400, #45.00. ISBN 0471-95333-4.
J.B. Kadane, Bayesian Methods and Ethics in a Clinical Trial Design, New York: Wiley, 1996, pp. 336, #55.00. ISBN 0471-84680-5.
M.J. Kolen and R.L. Brennan, Test Equating Methods and Practices, New York: Springer, 1995, pp. 340, $44.95. ISBN 0-387-94486-9.
V.G. Kulkarni, Modeling and Analysis of Stochastic Systems, London (UK): Chapman and Hall, 1995, pp. 630, $57.95. ISBN 0-412-04991-0.
K.J. Lindsey, Introductory Statistics: A Modelling Approach, Oxford (UK): Clarendon Press/Oxford Science Publications, 1995, pp. 225, $29.95, ISBN 0-19-852346-7 (hbk), $19.95, ISBN 0-19-852345-9 (pbk).
D.S. Moore, Basic Practice of Statistics, New York: Freeman, 1995, pp. 700, $18.95. ISBN 0-7167-2628-9.
G.J. Olsder (Ed.), New Trends in Dynamic Games and Applications (Series: Annals of the International Society of Dynamic Games, Vol. 3), Boston: Birkhauser Verlag, 1995, pp. 490, SFR 148.- (ca. $74.-). ISBN 0-8176-3812-1.
G. Owen, Game Theory (3rd ed.), San Diego (Cal): Academic Press, 1995, pp. 150, $42.-. ISBN 0-12-531151-6.
Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Boston: Kluwer Academic Publishers, Series: Theory and Decision Library, Series D: System Theory, Knowledge Engineering and Problem Solving, 1995, pp. 240, $106.-. ISBN 0-7923-1472-7.
A. Rao, L.P. Carr, I.G. Dambolena, F. Rafii, R.J. Chop, J. Martin, and P. Fireman Schlesinger, Total Quality Management: A Cross Functional Approach, New York: Wiley, 1996, pp. 656, #30.-. ISBN 0471-10804-9 (hbk).
C.P. Robert, The Bayesian Choice: A Decision-Theoretic Motivation, New York: Springer, 1994, pp. 436, $45.-. ISBN 3-540-94296-3.
Sir Michael Rutter (Chairman), Genetics of Criminal and Antisocial Behavior: Ciba Foundation Symposium 194, New York: Wiley, 1995, pp. 300, #49.95. ISBN 0471-95719-4.
C. Summers and B. Dunetz, ISDN: How to Get a High-Speed Connection to the Internet, New York: Wiley, 1996, pp. 368, #17.95. ISBN 0471-13326-4.
E.A. Weitzman, Computer Methods for Qualitative Data Analysis: A Software Sourcebook, London (UK): Sage, 1995, pp. 384, #44.95, ISBN 0-8039-5536-7 (hbk); #19.95, ISBN 0-8039-5537-5.
T.H. Wonnacott and R.J. Wonnacott, Introductory Statistics (5th ed.), New York: Wiley, pp. 720, $25.-, ISBN 0471-61518-8.
J.H. Zar, Biostatistical Analysis (3rd ed.), Englewood Cliffs (NJ): Prentice Hall, 1996, pp. 800, $40.95. ISBN 0-13-086398-X.