The "Four-Taxon Problem" and the "Three-Taxon Statement"

      Among four taxa A, B, C, & D, there are three hypotheses of relationship:
            either A is most closely related to B, or to C, or to D
      We want to evaluate the hypotheses in the form of the Three-Taxon Statement:
       "X and Y are more closely related to each other than either is to Z"
            The alternative hypotheses can be shown as networks, each with four branches and one internode

If, for example, A is most closely related to B:
       A & B will share characters inherited from their common ancestor
       The changes that produced these shared characters will have occurred on the internode between the two pairs (A & B versus C & D)

In the four-taxon problem, seven classes of SNP distributions can be identified
    (for a detailed analysis, see Notes on Parsimony Analysis)


 Positions 1 - 4 are uninformative:
           They give no information about relationships, because
                   all hypotheses require the same number of changes,
                   so none is more parsimonious than the others.
            Position 1 is invariant, and is the most common type.

Positions 5, 6 & 7 are informative:
         Two taxa share one state, and the other two share another
                They give information about relationships,
                because one hypothesis requires fewer changes than the others
                  & is therefore more parsimonious than the others
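
The informative / uninformative distinction can be sketched as a short test. This is a minimal illustration, not part of the original notes: the function name is_informative and the example sites are hypothetical, and each site is assumed to be a four-character string giving the states of taxa A, B, C, D in that order.

```python
from collections import Counter

def is_informative(site):
    """Return True if two taxa share one state and the other two share another."""
    if len(site) != 4:
        raise ValueError("expected exactly one state per taxon (A, B, C, D)")
    # Count how many taxa carry each state, e.g. "GGTT" -> [2, 2]
    return sorted(Counter(site).values()) == [2, 2]

# Hypothetical example sites (assumed order of taxa: A, B, C, D)
for site in ["GGGG", "GGGT", "GGTC", "GGTT", "GTGT", "GTTG"]:
    print(site, is_informative(site))
```

Under these assumptions, the first three sites print False and the last three print True: only a two-and-two split of states can favour one hypothesis over the others.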

   Position 5 indicates that A & B are most closely related:
      The first hypothesis (A with B) explains the distribution of SNPs with a single change on the internode,
         the other two hypotheses (A with C, A with D) require two changes each
     The first hypothesis is therefore a more parsimonious explanation of the data than the others.
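
The count of changes for a Position 5 site can be checked with a small sketch. It assumes a Fitch-style count on a four-taxon network (one possible way to tally changes, not necessarily the method used in the figures), a hypothetical site "GGTT" with taxa in the order A, B, C, D, and hypothetical names HYPOTHESES, changes_required and score_site.

```python
# Each hypothesis pairs the taxa on either side of the internode.
# Indices refer to the assumed taxon order A=0, B=1, C=2, D=3.
HYPOTHESES = {
    "AB|CD": ((0, 1), (2, 3)),
    "AC|BD": ((0, 2), (1, 3)),
    "AD|BC": ((0, 3), (1, 2)),
}

def changes_required(site, pairing):
    """Minimum number of changes for one site on one four-taxon network."""
    changes = 0
    end_sets = []
    for i, j in pairing:
        if site[i] == site[j]:
            end_sets.append({site[i]})           # pair agrees: no change needed
        else:
            end_sets.append({site[i], site[j]})  # pair differs: one change
            changes += 1
    # If the two ends of the internode share no state, one change falls on it
    if not (end_sets[0] & end_sets[1]):
        changes += 1
    return changes

def score_site(site):
    """Changes required by each of the three hypotheses for one site."""
    return {name: changes_required(site, pairing)
            for name, pairing in HYPOTHESES.items()}

print(score_site("GGTT"))  # {'AB|CD': 1, 'AC|BD': 2, 'AD|BC': 2}
```

For this site the A with B hypothesis needs one change and the other two need two each, which matches the argument above. A site such as "GGTC" scores two changes under all three hypotheses, so it does not discriminate among them.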

By the same logic:
   Position 6 indicates that A & C are most closely related.
   Position 7 indicates that A & D are most closely related.

  Homework: for the three networks above, sketch the changes required by sites of Positions 6 & 7


Figures & Text material © 2024 by Steven M. Carr