Models of Molecular Evolution
Given the four nucleotides A, C, G, and T,
there are (4 × 4) − 4 = 12 possible
pairwise mutations among them that result in a SNP.
Mutation rates among the four nucleotides can be set in various
ways, based on data and assumptions.
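The count of twelve can be checked by listing every ordered pair of distinct nucleotides. The following is only a minimal sketch; the names `NUCLEOTIDES` and `substitutions` are illustrative and do not come from the original text.

```python
from itertools import permutations

NUCLEOTIDES = "ACGT"

# Each ordered pair of two different nucleotides is one possible mutation,
# so A->C and C->A are counted separately.
substitutions = [f"{a}->{b}" for a, b in permutations(NUCLEOTIDES, 2)]
n_substitutions = len(substitutions)

# Same count as the (4 x 4) - 4 arithmetic: all pairs minus the four
# "no change" pairs A->A, C->C, G->G, T->T.
n_by_formula = len(NUCLEOTIDES) ** 2 - len(NUCLEOTIDES)
```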
[Left] The
original and simplest model is the Jukes & Cantor (1969)
model, called JC69, which assumes that all nucleotide
frequencies are equal and that all mutations leading to a SNP occur
at the same rate, m. For example, the reciprocal rates A→C and C→A
are equal to each other, and both are equal to A→G.
At the time it was introduced, there were few data to suggest
otherwise and, more importantly, the model was mathematically and
computationally simple enough to use before the days of PCs.
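As a rough illustration of the JC69 assumptions, the sketch below builds a 4 × 4 rate matrix in which every off-diagonal entry is the single rate m. The function name `jc69_rate_matrix`, the nucleotide order A, C, G, T, and the convention that each diagonal entry is minus the sum of the other entries in its row are assumptions made for this example, and the value 0.1 is arbitrary.

```python
NUCLEOTIDES = "ACGT"  # assumed row/column order: A, C, G, T


def jc69_rate_matrix(m):
    """Return a 4x4 JC69-style rate matrix as nested lists."""
    n = len(NUCLEOTIDES)
    # Off-diagonal entries: the one shared rate m.
    # Diagonal entries: minus the sum of the other three entries in the row,
    # so that every row sums to zero (an assumed convention).
    return [[-(n - 1) * m if i == j else m for j in range(n)]
            for i in range(n)]


Q_jc = jc69_rate_matrix(0.1)  # 0.1 is an arbitrary illustrative rate
```

With this layout, `Q_jc[0][1]` (A→C), `Q_jc[1][0]` (C→A), and `Q_jc[0][2]` (A→G) all hold the same value.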
[Right] Given the
availability of data and increased understanding of molecular
evolution, it became apparent that nucleotide frequencies in any
one DNA strand are unequal, that mutation rates differ among
nucleotides, and
in particular that transitions (A↔G
and C↔T)
are much more frequent than transversions (all other
pairwise mutations) by a factor K. The Hasegawa,
Kishino, & Yano (1985) model, called HKY85,
incorporates all these factors. In the last column, for example,
the mutation rate of any nucleotide A, C, or G
to T is set by the same quantity, the frequency of T, with the
C→T transition additionally weighted by K.
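The HKY85 ingredients (unequal nucleotide frequencies plus a transition bias) can be sketched as follows. This is an illustration under stated assumptions, not the figure's exact matrix: the rate of X→Y is taken to be the frequency of Y, multiplied by the bias when the change is a transition; `kappa` stands for the factor K in the text; the frequencies are made-up values; and the diagonal is set so each row sums to zero.

```python
NUCLEOTIDES = "ACGT"  # assumed row/column order: A, C, G, T
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def hky85_rate_matrix(freqs, kappa):
    """freqs: dict nucleotide -> frequency; kappa: transition bias (K)."""
    Q = []
    for src in NUCLEOTIDES:
        row = []
        for dst in NUCLEOTIDES:
            if src == dst:
                row.append(0.0)  # placeholder, replaced just below
            else:
                # Transitions get the extra factor kappa; transversions do not.
                bias = kappa if (src, dst) in TRANSITIONS else 1.0
                row.append(bias * freqs[dst])
        # Diagonal: minus the sum of the rates of change in this row.
        row[NUCLEOTIDES.index(src)] = -sum(row)
        Q.append(row)
    return Q


# Illustrative values only, not estimates from any data set.
freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
Q_hky = hky85_rate_matrix(freqs, kappa=2.0)
```

In the last column of `Q_hky`, the A→T and G→T entries equal the frequency of T, while the C→T entry is that frequency times `kappa`.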
[Below] As
computational power and extensive data have become available, it has
become possible to construct a universal model, called the General
Time Reversible (GTR) model, which allows all
available information to be incorporated into any particular
evolutionary investigation. Estimates of mutation rates are
calculated from the data themselves. In the last column, for
example,
the T term of the HKY85 model is weighted by three different
nucleotide-specific factors, one of which
incorporates the transition bias. The probability that T
remains unchanged is also explicitly calculated, as the negation
of the sum of probabilities in the last row that it does
change.
where there are six distinct reciprocal mutation
rates, one for each unordered pair of nucleotides:
A↔C, A↔G, A↔T, C↔G, C↔T, and G↔T.
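A GTR-style matrix with six reciprocal rates can be sketched in the same way. The parameterization below is an assumption for illustration: the rate of X→Y is the reciprocal rate for the unordered pair {X, Y} multiplied by the frequency of Y, which is one common way to make the model time reversible. The function name `gtr_rate_matrix`, the dictionary `exch`, and all numeric values are hypothetical; in practice the rates would be estimated from the data.

```python
from itertools import combinations

NUCLEOTIDES = "ACGT"  # assumed row/column order: A, C, G, T


def gtr_rate_matrix(freqs, exch):
    """exch maps each unordered pair, e.g. frozenset("AC"), to its rate."""
    Q = []
    for src in NUCLEOTIDES:
        row = [0.0 if src == dst else exch[frozenset((src, dst))] * freqs[dst]
               for dst in NUCLEOTIDES]
        # Diagonal: the negation of the sum of the rates of change in the row.
        row[NUCLEOTIDES.index(src)] = -sum(row)
        Q.append(row)
    return Q


# The six unordered pairs: AC, AG, AT, CG, CT, GT.
pairs = [frozenset(p) for p in combinations(NUCLEOTIDES, 2)]
# Made-up rates; A<->G and C<->T (transitions) are given the larger values.
exch = dict(zip(pairs, [1.0, 4.0, 0.5, 0.8, 3.5, 1.2]))
freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}  # illustrative only
Q_gtr = gtr_rate_matrix(freqs, exch)
```

Under these assumptions the matrix satisfies the reversibility condition freq(X) × rate(X→Y) = freq(Y) × rate(Y→X) for every pair, and the entry for T remaining unchanged, `Q_gtr[3][3]`, is minus the sum of the other entries in the last row.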