Stephen J. Willson

Mathematics Department
Iowa State University
Ames, IA 50011

Office: 411 Carver

Telephone: (515)-294-7671

FAX: (515)-294-5454


Web page: homepage

For more information about the Laurence H. Baker Center for Bioinformatics and Biological Statistics click CBBS

Interests in computational biology:

A central problem in computational biology is the building of phylogenetic trees given analogous DNA strings of known species. Many methods exist for the construction of such trees. (See Felsenstein's software overview or Glasgow software overview for many interesting links.)

I have developed procedures for building phylogenetic trees using information on the aligned sequences of four species at a time. For each such "quartet" a method is used to determine the appropriate tree for those species alone. I then build an overall phylogenetic tree by fitting together the trees for each quartet.
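
As a rough illustration of the bookkeeping involved (a sketch only, not the code in my software), the following Python fragment lists every quartet of a taxon set together with the three possible unrooted trees for it. The function names and the representation of a quartet tree as a pair of pairs are assumptions made for this example.

```python
from itertools import combinations


def topo(left, right):
    """Encode the quartet tree left|right as an unordered pair of pairs."""
    return frozenset({frozenset(left), frozenset(right)})


def quartets(taxa):
    """Return every set of four taxa, in a fixed order."""
    return [frozenset(q) for q in combinations(sorted(taxa), 4)]


def candidate_trees(quartet):
    """Return the three fully resolved unrooted trees on four taxa."""
    a, b, c, d = sorted(quartet)
    return [topo((a, b), (c, d)), topo((a, c), (b, d)), topo((a, d), (b, c))]


# Five species give 5 quartets; each quartet has 3 candidate trees.
species = ["A", "B", "C", "D", "E"]
all_quartets = quartets(species)
```

A quartet method then has to pick one of the three candidates for each quartet; the number of quartets grows as n(n-1)(n-2)(n-3)/24 for n species.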

Several methods may be used to compute the tree for each quartet. The most common are maximum parsimony, maximum likelihood, and neighbor-joining. I have also developed a new method called Higher Order Parsimony, which introduces corrections to maximum parsimony calculations and thereby reduces the effects of long-branch attraction. On artificial datasets it is much faster than maximum likelihood, more accurate than maximum parsimony, and comparable in accuracy to maximum likelihood. See my paper "A higher order parsimony method to reduce long-branch attraction," Molecular Biology and Evolution 16 (1999): 694-705.

All these methods also yield an internally generated numerical measure of the confidence that we have in the correctness of the tree chosen for each quartet. A tree T for all species may then be given a numerical measure I(T) of "inconsistency". If the inconsistency I(T) of a tree T is large and positive, then T contradicts the information for some quartet about which we have high confidence. If I(T) is small and positive, then the only quartets whose trees T contradicts are those for which the evidence was shaky. If I(T) is negative, then T does not contradict the evidence from any quartet.
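
The three cases above can be made concrete with a toy score. The sketch below is an assumed stand-in, not the definition of I(T) from the paper: for each quartet it subtracts the support of the tree that T displays from the support of the best rival tree, and takes the largest such difference. The support values, the function names, and the representation of T by one side of each internal split are all hypothetical.

```python
from itertools import combinations


def topo(left, right):
    """Encode the quartet tree left|right as an unordered pair of pairs."""
    return frozenset({frozenset(left), frozenset(right)})


def induced(splits, quartet):
    """Return the quartet tree displayed by a tree given as split sides."""
    q = frozenset(quartet)
    for side in splits:
        inside = q & side
        if len(inside) == 2:  # this split separates two taxa from the other two
            return topo(inside, q - inside)
    raise ValueError("tree does not resolve this quartet")


def inconsistency(splits, support):
    """Toy stand-in for I(T): worst (best rival support - displayed support)."""
    worst = float("-inf")
    for quartet, scores in support.items():
        shown = induced(splits, quartet)
        rival = max(v for t, v in scores.items() if t != shown)
        worst = max(worst, rival - scores[shown])
    return worst


def support_for(best, quartet):
    """Hypothetical confidences: 0.75 for `best`, 0.125 for each rival."""
    a, b, c, d = sorted(quartet)
    three = [topo(a + b, c + d), topo(a + c, b + d), topo(a + d, b + c)]
    return {t: (0.75 if t == best else 0.125) for t in three}


# The tree ((A,B),C,(D,E)), written as one side of each internal split.
TREE = [frozenset("AB"), frozenset("DE")]

# Case 1: every quartet's evidence favors the tree that TREE displays.
agree = {frozenset(q): support_for(induced(TREE, q), q)
         for q in combinations("ABCDE", 4)}

# Case 2: the evidence for {A,B,C,D} strongly favors AC|BD instead.
clash = dict(agree)
clash[frozenset("ABCD")] = support_for(topo("AC", "BD"), "ABCD")
```

With these made-up numbers, `inconsistency(TREE, agree)` is negative and `inconsistency(TREE, clash)` is large and positive, mirroring the first and last cases described above.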

My method of fitting together the trees for each quartet involves seeking a tree T of low inconsistency. Details may be found in my paper "Measuring inconsistency in phylogenetic trees," Journal of Theoretical Biology 190 (1998): 15-36.

An alternative method of building trees makes use of local inconsistency measures. This numerical measure gives finer resolution and discrimination when trying to place a new taxon into a phylogenetic tree. In particular, one can build a tree by always choosing the strongest "signal strength." Details may be found in my paper "Building phylogenetic trees from quartets by using local inconsistency measures," Molecular Biology and Evolution 16 (1999): 685-693.

Any of the quartet methods may be "sharpened" by an error-correction map described in "An error correcting map for quartets can improve the signals for phylogenetic trees," preprint (2000). The idea is that information about the tree for taxa {A,B,C,D} may be found by looking at the trees for {B,C,D,E}, {A,C,D,E}, {A,B,D,E}, and {A,B,C,E}. If they fit together into a tree for all of A,B,C,D, and E, then the tree for {A,B,C,D} in particular is determined. In this way the trees for other quartets yield information about the tree for {A,B,C,D}, and sometimes this information can be used to "correct" the tree given in the quartet list for {A,B,C,D}.
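
The five-taxon idea can be sketched by brute force. The fragment below is only an illustration under stated assumptions, not the error-correcting map of the preprint: it enumerates the 15 fully resolved unrooted trees on five taxa, keeps those that agree with the four observed quartet trees containing E, and reports the tree they imply for {A,B,C,D} when that tree is unique. All function names are hypothetical.

```python
def topo(left, right):
    """Encode the quartet tree left|right as an unordered pair of pairs."""
    return frozenset({frozenset(left), frozenset(right)})


def five_taxon_trees(taxa):
    """Yield each resolved tree on five taxa as its two cherries (pairs)."""
    taxa = sorted(taxa)
    for middle in taxa:  # the taxon attached between the two cherries
        rest = [t for t in taxa if t != middle]
        for partner in rest[1:]:
            pair1 = frozenset({rest[0], partner})
            yield (pair1, frozenset(rest) - pair1)


def induced(cherries, quartet):
    """Return the quartet tree displayed by a five-taxon tree."""
    q = frozenset(quartet)
    for pair in cherries:
        if pair <= q:  # a cherry lying wholly inside the quartet resolves it
            return topo(pair, q - pair)
    raise ValueError("quartet is not a subset of the five taxa")


def implied_quartet(target, extra, observed):
    """Tree for `target` implied by the quartets containing `extra`, or None."""
    taxa = set(target) | {extra}
    implied = set()
    for cherries in five_taxon_trees(taxa):
        if all(induced(cherries, q) == t for q, t in observed.items()):
            implied.add(induced(cherries, target))
    return implied.pop() if len(implied) == 1 else None


# Observed trees for the four quartets that contain E (made-up example).
observed = {
    frozenset("BCDE"): topo("BC", "DE"),
    frozenset("ACDE"): topo("AC", "DE"),
    frozenset("ABDE"): topo("AB", "DE"),
    frozenset("ABCE"): topo("AB", "CE"),
}
result = implied_quartet("ABCD", "E", observed)
```

Here the four observed trees fit together only into ((A,B),C,(D,E)), so `result` is the tree AB|CD. If the observed trees do not fit into any five-taxon tree, the function returns None and no correction is suggested from this choice of E.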

For my software for building phylogenetic trees, click software

Last updated January 8, 2003.