Sample DNA data set coded as binary data
The top panel shows a typical set of 10 bp DNA sequences
from six individuals in a population. DNA sequence
variants occur at all 10 positions; such variable positions
are called Single Nucleotide Polymorphisms (SNPs) or
segregating sites.
The middle panel re-codes the DNA sequences
in binary form, in each case taking the state in
Sequence I as 0 and any SNP variant as 1.
We will assume for the moment that all SNP changes
are from 0 to 1, and we call any 0 the ancestral state
and any 1 the derived state. The bottom panel extracts
the binary codes for the derived sites as shaded "1"
for ease of comparison.
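
The re-coding described above can be sketched in a few lines of Python. This is only an illustration: the function names `recode_binary` and `segregating_sites` are hypothetical, and the four sequences are made-up examples, not the data shown in the figure. As in the text, the first sequence is assumed to carry the ancestral (0) state at every position.

```python
def recode_binary(sequences, reference_index=0):
    """Recode aligned DNA sequences as 0/1 strings relative to a reference.

    A base identical to the reference is scored 0 (ancestral);
    any other base is scored 1 (derived).
    """
    reference = sequences[reference_index]
    if any(len(seq) != len(reference) for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [
        "".join("0" if base == ref else "1" for base, ref in zip(seq, reference))
        for seq in sequences
    ]


def segregating_sites(binary):
    """Return 0-based positions where at least one sequence carries a 1."""
    return [i for i in range(len(binary[0])) if any(row[i] == "1" for row in binary)]


# Hypothetical aligned 10 bp sequences (NOT the data in the figure).
seqs = ["ACGTACGTAC", "ACGTACGTAC", "AAGTACGTAC", "ACGTACTTAC"]
codes = recode_binary(seqs)
sites = segregating_sites(codes)

for label, code in zip(["I", "II", "III", "IV"], codes):
    print(label, code)
print("segregating sites (0-based):", sites)
```

Each binary string is the haplotype of one individual, so comparing strings directly shows which individuals share a haplotype.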
Each of the six individuals has a distinct
haplotype, which may be written, for example, for #III
as 0101010010. [Note that the
three SNPs all involve transversions
(alternative purine / pyrimidine bases): t/g,
c/a, & a/t.]
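
The purine / pyrimidine rule in the note can be checked mechanically. The sketch below is an illustration only, and the function name `substitution_type` is hypothetical. It relies on the standard classification of A and G as purines and C and T as pyrimidines: a change within one class is a transition, and a change between classes is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(base1, base2):
    """Classify a substitution between two different DNA bases."""
    b1, b2 = base1.upper(), base2.upper()
    if not {b1, b2} <= PURINES | PYRIMIDINES:
        raise ValueError("bases must be one of A, C, G, T")
    if b1 == b2:
        raise ValueError("bases are identical; no substitution")
    # Both bases in the same chemical class -> transition; otherwise transversion.
    same_class = {b1, b2} <= PURINES or {b1, b2} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The three base pairs listed in the note above.
pairs = [("t", "g"), ("c", "a"), ("a", "t")]
types = [substitution_type(a, b) for a, b in pairs]

for (a, b), kind in zip(pairs, types):
    print(f"{a}/{b}: {kind}")
```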
Figure & Text material © 2022 by Steven M. Carr